Historic,  Archive  Document 

Do  not  assume  content  reflects  current 
scientific  knowledge,  policies,  or  practices. 


U.S. Department of Agriculture

Human Nutrition Information Service


Abstract 


Response rates for the 1987-88 Nationwide Food Consumption Survey (NFCS 1987-88) were very low: 38 percent at
the  household  level  and  31  percent  at  the  individual  level.  This  publication  provides  information  on  the  procedures 
used  in  NFCS  1987-88,  the  response  rates,  and  the  characteristics  of  the  unweighted  sample  compared  with 
population  estimates  from  the  U.S.  Bureau  of  the  Census.  The  regression  weighting  approach  used  to  adjust  for 
nonresponse  is  described.  Analyses  done  to  evaluate  the  effect  of  nonresponse  on  estimates  of  food  and  nutrient 
intakes  in  NFCS  1987-88  are  presented.  Results  of  a  study  of  attrition  suggested  that  the  regression  weighting  may 
correct  nonresponse  bias.  The  study  showed  that  differences  between  respondents  and  nonrespondents  in  eating 
behavior  were  predictable  because  they  were  caused  by  known  socioeconomic  variables,  which  can  be  adjusted  for 
by  weighting,  and  were  not  caused  by  some  other  unknown  and  nonrandom,  and  thus  unpredictable,  response 
propensity.  Also,  a  comparison  of  NFCS  1987-88  with  the  NFCS  1977-78  and  the  1985  and  1986  Continuing  Survey 
of  Food  Intakes  by  Individuals  revealed  that  differences  in  results  appeared  to  be  caused  by  the  differences  in 
methodology,  design,  and  target  samples  rather  than  by  nonresponse.  Despite  the  low  response  rate,  the  NFCS 
1987-88 provides better estimates of current dietary intake than does the NFCS 1977-78. Users must balance their
need  for  the  data  and  their  tolerance  for  error  against  the  limitations  of  the  data. 
Introduction


The  1987-88  Nationwide  Food  Consumption  Survey  (NFCS  1987-88)  is  the  most  recent  of  seven  decennial  surveys 
that  have  been  conducted  by  the  U.S.  Department  of  Agriculture  (USDA).  The  surveys  are  used  to  describe  food 
consumption  behavior  and  to  assess  the  nutritional  content  of  diets  of  Americans.  The  information  generated  by  these 
surveys  is  used  to  develop  policies  on  nutrition  education,  food  production  and  marketing,  food  assistance  programs, 
and  food  safety. 

NFCS 1987-88 included the collection of two types of information on food consumption: (1) food used by households
during  a  7-day  period  and  the  cost  of  that  food  and  (2)  food  eaten  by  individuals  in  the  same  households  during  a 
3-day  period.  The  survey  was  conducted  by  National  Analysts,  a  division  of  Booz,  Allen,  and  Hamilton,  under  contract 
with  USDA.  The  household  data  and  the  first  day  of  individual  data  were  collected  in  personal,  in-home  interviews. 
The  second  and  third  days  of  individual  data  were  collected  by  a  self-administered  record. 

Data collection for the survey began on April 1, 1987, and was expected to be completed by March 31, 1988. Because
of low response rates in the first quarter of the survey, adjustments were made to increase the sample size in
subsequent quarters and data collection was extended into August 1988. Efforts to increase the response rates were
not very successful. The final response rates were very low: only 38 percent of sample households that were occupied
agreed  to  participate.  Within  these  participating  households,  81  percent  of  the  eligible  individuals  provided  at  least  1 
day  of  intake  data,  yielding  an  estimated  individual  response  rate  of  31  percent. 

The  Human  Nutrition  Information  Service  (HNIS)  contracted  with  the  Life  Sciences  Research  Office  (LSRO)  of  the 
Federation  of  American  Societies  for  Experimental  Biology  (FASEB)  to  conduct  an  independent  review  of  the  impact  of 
nonresponse  on  estimates  of  food  and  nutrient  intakes  in  the  NFCS.  LSRO  convened  a  panel  of  statisticians  (Expert 
Panel)  who  reviewed  the  design  and  execution  of  the  NFCS,  evaluated  studies  on  nonresponse  conducted  by  HNIS, 
and  made  recommendations  about  the  useability  of  the  data.  The  main  text  of  the  LSRO  report  is  an  appendix  to  this 
publication.  Many  of  the  tables  and  figures  that  appear  in  the  LSRO  report  appendixes  are  used  in  this  publication.^ 

This  publication  on  nonresponse  in  the  NFCS  1987-88  serves  two  purposes:  to  provide  information  on  the  data 
collection procedures used in NFCS 1987-88, the response rates, and a discussion of the weighting approach used to
adjust  for  nonresponse  (chapters  1,  2,  and  3);  and  to  describe  the  analyses  conducted  by  HNIS  to  evaluate  the  effect 
of  nonresponse  in  the  NFCS  (chapters  4  and  5). 


^The  complete  LSRO  report  can  be  obtained  from  FASEB  Special  Publications  Office,  9650  Rockville  Pike,  Bethesda, 
MD  20814.  The  cost  is  $24  (plus  5  percent  sales  tax  for  Maryland  residents). 


Chapter  1:  Background 

Katherine  S.  Tippett  and  Alvin  B.  Nowverl 

Human  Nutrition  Information  Service 

Sample  Design 

The NFCS 1987-88 sample was designed to be a self-weighting, multistage, stratified, area probability sample of
households in the 48 conterminous States. The sampling frame was organized using estimates of the U.S. population
in 1980. Adjustments were made to the sampling frame at the time of the survey to reflect the 1987 population. The
target sample was 6,000 households; these households were expected to yield about 15,000 individuals. NFCS 1987-88
included two samples: a basic sample of households at all levels of income and a low-income sample of
households  with  incomes  at  or  below  130  percent  of  the  poverty  thresholds  provided  by  the  U.S.  Bureau  of  the 
Census.  This  report  on  nonresponse  covers  the  basic  sample  only. 

The  stratification  plan  took  into  account  geographic  location,  degree  of  urbanization,  and  socioeconomic 
considerations.  Each  successive  sampling  stage  selected  smaller,  more  specific  locations.  The  48  States  were 
grouped  into  the  nine  census  geographic  divisions.  Then  all  land  areas  within  the  divisions  were  divided  into  three 
urbanization  classifications:  metropolitan  central  city,  metropolitan  noncentral  city  (suburban),  and  nonmetropolitan. 
Thus,  all  cities  and  counties  in  the  conterminous  United  States  were  classified  into  27  superstrata. 

The  27  superstrata  were  further  divided  into  60  strata,  which  correspond  to  the  geographic  distribution,  urbanization, 
and  density  of  the  population  within  the  conterminous  United  States.  The  average  size  of  these  strata  was 
approximately  4  million  persons.  Smaller,  relatively  homogeneous  units,  called  primary  sampling  units  (PSU),  were 
formed from single counties or combinations of counties in nonmetropolitan strata, from cities or parts of cities in central city strata, and
from counties or the balance of counties having central cities in suburban strata. Two PSU's were selected for each of
the  60  strata.  The  two  PSU's  were  selected  from  each  stratum  with  replacement;  that  is,  the  selection  of  one  PSU  did 
not  preclude  its  selection  as  the  second  PSU.  Because  one  PSU  drawn  into  the  sample  was  completely  lost  to 
nonresponse,  the  final  sample  included  119  PSU's. 
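
The with-replacement draw described above can be sketched as follows. This is only an illustration: the stratum and PSU labels are hypothetical, and equal selection probabilities are assumed because the text does not state the probabilities actually used.

```python
import random


def draw_psus(strata, per_stratum=2, seed=0):
    """Draw PSUs from each stratum with replacement.

    strata: dict mapping a stratum label to a list of PSU labels
    (all labels here are hypothetical). Equal selection probabilities
    are an assumption; the survey's actual probabilities are not given.
    """
    rng = random.Random(seed)
    # random.choices samples with replacement, so the same PSU can be
    # selected twice within a stratum.
    return {s: rng.choices(psus, k=per_stratum) for s, psus in strata.items()}


# 60 strata, each holding a few hypothetical PSUs.
strata = {
    f"stratum_{i:02d}": [f"psu_{i:02d}_{j}" for j in range(5)]
    for i in range(1, 61)
}
sample = draw_psus(strata)
n_draws = sum(len(draws) for draws in sample.values())  # 60 strata x 2 draws
```

Two draws in each of 60 strata give 120 selections; with one PSU lost entirely to nonresponse, 119 remain, as reported.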

Each  selected  PSU  was  divided  geographically  along  census  boundaries  into  smaller  clusters,  known  as  area 
segments,  containing  a  minimum  of  100  housing  units.  These  segments  usually  consisted  of  one  or  more  city  blocks 
in  urban  areas  and  part  of  a  census  enumeration  district  elsewhere.  A  total  of  1,000  area  segments  were  drawn  into 
the  sample  across  all  PSU's  to  maximize  spread  of  interviews  in  the  PSU,  to  create  efficient  interviewer  workloads,  and 
to  target,  on  average,  six  interviewed  households  per  area  segment. 

The 1,000 area segments were prelisted before the survey to identify the existing housing units within the area
boundaries.  The  prelisted  number  of  housing  units  in  the  area  as  of  1987,  together  with  estimates  of  occupancy  and 
completion  rates,  served  as  the  basis  for  determining  the  number  of  housing  units  to  be  selected  for  the  sample  from 
that  area. 

NFCS  1987-88  was  to  include  a  followup  survey  of  nonresponding  households  to  determine  some  of  their 
characteristics;  however,  this  survey  was  not  conducted. 

More detail on the sample design of NFCS 1987-88 is available in appendix A of the LSRO report (1)^ and in NFCS
Report No. 87-1-1, which gives results of the individual intake component (2).


Data  Collection 

To  contact  individuals  in  housing  units  selected  as  part  of  the  sample,  interviewers  made  a  minimum  of  three  personal 
visits  plus  up  to  eight  telephone  calls  to  each  household  having  a  telephone.  To  contact  households  without 
telephones,  interviewers  increased  the  number  of  personal  visits,  when  necessary,  to  six  (five  in  rural  areas). 
Interviewers  were  expected  to  make  up  to  six  call-back  attempts  in  urban  areas  and  five  in  rural  areas.  These 
attempts were to be on different days of the week with at least one attempt on a weekend. The day was divided into
four parts (morning, early afternoon, late afternoon, and evening), and attempts to contact a sample household were to
be made during each of these periods, if needed.


^Numbers in parentheses refer to references in the section "Literature Cited."

The  interviewer  asked  to  speak  with  the  person  most  responsible  for  planning  or  preparing  meals  and  provided  this 
person  with  a  letter  of  introduction  and  described  the  survey.  The  respondent  was  asked  to  save  all  food-purchase 
receipts  for  the  next  7  days  along  with  food  labels,  recipes,  and  other  reminders  of  foods  served  over  the  period.  The 
interviewer  returned  7  days  later  to  conduct  the  survey  with  the  aid  of  a  laptop  computer. 

A  list-aided  recall  method  was  used  to  collect  the  types,  amounts,  and  costs  (if  purchased)  of  all  the  foods  used  by  the 
household  during  the  previous  7  days.  Then  the  interviewer  completed  a  1-day  recall  of  food  intake  for  each 
household  member  present. 

The  main  meal  planner/preparer  was  asked  to  report  for  any  children  under  the  age  of  12  and  for  absent  members  of 
the  household.  If  the  main  meal  planner/preparer  could  not  supply  the  information,  the  recall  form  was  left  at  the 
household  to  be  reviewed  or  completed  by  the  absent  person. 

The  interviewer  then  described  the  2-day  dietary  record  and  helped  each  household  member  begin  a  record  of  the 
current  day's  intake.  The  interviewer  returned  2  to  4  days  later  to  collect  and  review  the  2-day  records  and  distribute 
the  monetary  incentives  of  $2  per  completed  3-day  recall-plus-record  set. 

The  survey  design  called  for  all  sample  housing  units  selected  in  each  quarterly  sample  to  be  contacted,  and 
interviews  completed,  during  the  designated  3-month  period.  As  the  fieldwork  progressed,  it  became  apparent  that 
this  goal  could  not  be  achieved.  Successful  resolutions  of  contacts  with  sample  households  were  not  being  obtained 
during  the  designated  3-month  period.  It  was  therefore  decided  to  continue  attempting  contacts  and  interviews  with 
sample  housing  units  beyond  the  initial  period.  Further,  because  of  the  lower  than  anticipated  response  rate  in  the  first 
quarter,  the  sampling  rate  was  increased  in  the  second,  third,  and  fourth  quarters.  Table  1  provides  information  on 
participation  levels  for  each  quarterly  sample. 

Response  Rates 

The  response  rates  were  very  low.  Only  38  percent  of  targeted  occupied  housing  units  participated  (table  2).  Some 
participating  households  did  not  provide  complete  food  use  information;  therefore,  the  final  response  rate  for  the 
household  component  of  the  NFCS  was  37  percent.  Response  rates  for  the  individual  intake  component  of  the  survey 
were  calculated  separately.  Of  the  individuals  living  in  participating  households,  81  percent  completed  the  day  1  intake 
(an  estimated  overall  response  rate  of  31  percent);  and  83  percent  of  those  individuals  who  completed  the  first  day  of 
intake  completed  all  3  days,  yielding  an  estimated  overall  3-day  response  rate  of  25  percent. 
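
The rates quoted above can be reproduced from the totals in table 1. The sketch below is an illustration rather than the report's own computation; in particular, treating each estimated overall individual rate as the household participation rate multiplied by the within-household completion rate is an assumption that happens to match the published figures.

```python
# Totals from table 1 (basic sample, all quarters combined).
occupied_units = 12_181
participating_households = 4_589
completed_household_component = 4_495
individuals_in_participating = 12_522
completed_day1 = 10_172
completed_3days = 8_468


def pct(numerator, denominator):
    """Return a rate as a percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


household_participation = participating_households / occupied_units

rates = {
    "household participation": pct(participating_households, occupied_units),
    "household component": pct(completed_household_component, occupied_units),
    "day 1 within participating households": pct(
        completed_day1, individuals_in_participating
    ),
    "3 days among day 1 completers": pct(completed_3days, completed_day1),
    # Assumed construction of the estimated overall individual rates:
    # household participation x within-household completion.
    "day 1 overall (estimated)": round(
        100 * household_participation
        * completed_day1 / individuals_in_participating
    ),
    "3 days overall (estimated)": round(
        100 * household_participation
        * completed_3days / individuals_in_participating
    ),
}

for name, value in rates.items():
    print(f"{name}: {value}%")
```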

Reasons  for  nonresponse  were  numerous.  Interviewers  failed  to  contact  about  17  percent  of  sample  households,  14 
percent  of  contacted  households  refused  to  be  screened,  and  45  percent  of  those  screened  refused  to  participate  in 
the  interview.  A  high  rate  of  turnover  of  interviewers,  interviewers'  failure  to  follow  prescribed  schedules,  insufficient 
training  and  monitoring  of  the  interviewers,  and  less-than-effective  screening  and  interview  techniques  all  contributed 
to the poor response rates (1).

The  length  of  the  interview,  which  averaged  2.7  hours,  also  contributed  to  the  low  response.  Many  people  refused  to 
participate  after  being  informed  of  the  requirements  of  the  survey.  Other  suspected  reasons  for  the  low  response  rate 
include  an  increase  in  the  proportion  of  women  who  are  in  the  work  force,  and  thus  are  less  likely  to  be  home  or  to  be 
willing  to  devote  time  to  a  long  interview;  the  increased  number  of  surveys  by  many  types  of  organizations;  greater 
concern  about  letting  strangers  into  the  home;  and  a  lack  of  worthwhile  incentives  to  participate. 

Nonresponse  Adjustments 

As  the  level  of  nonresponse  became  apparent,  HNIS  staff  decided  that  traditional  nonresponse  adjustments  based 
only  on  geography  and  the  small  number  of  variables  available  from  census  data  (age,  sex,  race,  and  income)  could 
not  adequately  correct  for  potential  nonresponse  bias.  The  analysis  of  the  unweighted  NFCS  sample  presented  in 
chapter  2  confirmed  that  other  variables  such  as  household  composition  and  employment  status  should  be  addressed. 
A weighting approach, developed at Iowa State University, that controls for additional variables in a regression analysis
was employed to reduce potential nonresponse bias. The demographic variables controlled for are listed in chapter 2.
In  addition  to  these  variables,  day  of  the  week  and  month  of  the  interview  were  controlled  because  these  temporal 
characteristics  were  seriously  unbalanced.  If  these  variables  were  not  controlled,  biased  estimates,  unrelated  to 
nonresponse,  could  result.  Construction  and  efficiency  of  weighting  factors  are  discussed  in  chapter  3. 


Table 1. Participation levels for the 1987-88 Nationwide Food Consumption Survey, basic sample,
by quarter

                                                   Quarter 1 Quarter 2 Quarter 3 Quarter 4     Total

Housing units in sampling frame                        2,187     3,055     4,677     3,814    13,733
Occupied housing units                                 1,947     2,702     4,142     3,390    12,181
Contacted households                                   1,649     2,132     3,398     2,756     9,935
Screened households                                    1,393     1,855     2,860     2,342     8,450
Participating households                                 847     1,032     1,540     1,170     4,589
Households completing household component                822     1,011     1,503     1,159     4,495
Households participating in individual component         756       920     1,398     1,040     4,114
Individuals in participating households                2,291     2,729     4,264     3,238    12,522
Individuals completing day 1                           1,860     2,229     3,507     2,576    10,172
Individuals completing 3 days                          1,597     1,901     2,925     2,045     8,468

Reason occupied housing units not contacted:
  No one home; no answer                                 292       538       702       583     2,115
  No access                                                6        32        42        51       131

Reason contacted households not screened:
  Refused screening                                      248       251       492       379     1,370
  Language barrier                                         8        26        46        35       115

Reason screened households did not participate:
  Refused interview                                      537       789     1,264     1,112     3,702
  Other                                                    9        34        56        60       159

Table 2. Response rates in the 1987-88 Nationwide Food Consumption Survey, basic sample

Housing units selected                       13,733

Occupied housing units                       12,181  (89% of housing units selected)

Contacted households                          9,935  (82% of occupied housing units)

Screened households                           8,450  (69% of occupied housing units;
                                                      85% of contacted households)

Participating households                      4,589  (38% of occupied housing units;
                                                      54% of screened households)

Households with completed                     4,495  (98% of participating households;
food use questionnaires                               37% of occupied housing units;
                                                      53% of screened households)

Individuals in participating                 12,522
households

Individuals completing                       10,172  (81% of individuals in participating
day 1 recall                                          households; estimated 31% of individuals
                                                      in all occupied housing units)

Individuals completing                        8,468  (83% of individuals completing day 1 recall;
3 days recall/records                                 68% of individuals in participating
                                                      households; estimated 25% of individuals
                                                      in all occupied housing units)


Chapter 2: Demographic Characteristics of the Sample
Phillip  S.  Kott,  National  Agricultural  Statistics  Service 
Patricia  M.  Guenther,  Human  Nutrition  Information  Service 


Although  a  survey  sample  may  be  carefully  designed  to  include  all  segments  of  a  population  of  interest,  some 
proportion  of  the  selected  sample  typically  will  fail  to  respond,  as  in  NFCS  1987-88.  If  nonresponse  is  random 
throughout  the  selected  sample,  the  respondents  can  still  be  expected  to  represent  the  population  of  interest. 
However,  if  respondents  and  nonrespondents  have  systematically  different  food  consumption  patterns,  then  the 
respondents  may  not  represent  the  target  population,  possibly  biasing  the  survey  results.  The  magnitude  of  this  bias  is 
determined  by  the  overall  response  rate  and  the  level  of  difference  between  the  mean  values  of  survey  variables  for 
respondents  and  the  mean  values  of  survey  variables  for  nonrespondents.  Unfortunately,  because  nonrespondents  did 
not  respond,  there  is  no  information  on  what  they  ate  and,  therefore,  an  assessment  of  nonresponse  bias  is  difficult. 
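
The dependence of the bias on the response rate and on the respondent-nonrespondent difference can be illustrated with the standard deterministic identity: the respondent mean minus the full-population mean equals (1 - response rate) times (respondent mean - nonrespondent mean). The sketch below assumes that identity applies; the intake values are hypothetical and are not survey results.

```python
def nonresponse_bias(response_rate, mean_respondents, mean_nonrespondents):
    """Bias of the respondent mean as an estimate of the full-sample mean.

    Uses the identity bias = (1 - r) * (mean_R - mean_NR), where r is the
    response rate expressed as a fraction.
    """
    return (1.0 - response_rate) * (mean_respondents - mean_nonrespondents)


# Hypothetical daily energy intakes (kcal); not taken from the survey.
r = 0.31                      # individual response rate reported for day 1
mean_resp, mean_nonresp = 2000.0, 2100.0

bias = nonresponse_bias(r, mean_resp, mean_nonresp)

# Cross-check against the definition: respondent mean minus overall mean.
overall_mean = r * mean_resp + (1.0 - r) * mean_nonresp
direct_bias = mean_resp - overall_mean
```

With these made-up values the respondent mean understates the overall mean by about 69 kcal. The bias shrinks toward zero either as the response rate approaches 1 or as the two group means converge.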

It  is  possible  to  compare  characteristics  of  respondents  with  population  estimates  from  other  sources.  This  section 
contains  a  comparison  of  the  demographic  characteristics  of  those  responding  to  NFCS  1987-88  with  estimates  from 
the  1987  Current  Population  Survey  (CPS),  March  Supplement,  conducted  by  the  U.S.  Bureau  of  the  Census  for  the 
Bureau  of  Labor  Statistics  (3).  The  CPS  contains  data  which  are  used  to  estimate  demographic  characteristics  of  the 
U.S.  population.  Since  research  and  experience  have  shown  that  food  consumption  patterns  relate  in  part  to  various 
sociodemographic  variables,  a  comparison  of  distributions  of  sociodemographic  variables  in  NFCS  and  CPS  permits 
some  indirect  inferences  regarding  nonresponse  bias  that  would  result  if  the  unweighted  NFCS  data  were  used  for 
analysis. 

Demographic  characteristics  from  households  and  individuals  participating  in  NFCS  1987-88,  prior  to  weighting,  were 
compared  with  characteristics  of  the  general  population  as  estimated  by  the  1987  CPS.  Since  the  original  NFCS 
sample was designed to be self-weighting, the unweighted NFCS results and the CPS estimates could have been
expected  to  agree  within  sampling  error  if  there  had  been  complete  response. 

The  characteristics  chosen  for  the  comparison  have  been  shown  by  research  and  experience  to  be  related  to  food 
consumption (4, 5, 6, 7). Fourteen characteristics were compared for the household component and 13 for the
individual  intake  component;  these  characteristics  are  listed  in  tables  3  and  4. 

The  individual  intake  analysis  was  done  separately  for  three  sex-age  groups:  men  20  years  of  age  and  over,  women 
20  years  of  age  and  over,  and  persons  under  20  years  of  age.  Tables  3  and  4  provide  the  NFCS  sample  percents,  the 
population  percents  as  estimated  from  the  CPS,  and  the  absolute  t  values  for  both  the  household  and  the  individual 
analyses.  All  differences  between  the  sample  and  the  population  proportions  are  less  than  8  percentage  points. 
However,  there  are  16  individual  differences  and  4  household  differences  with  absolute  t  values  greater  than  3.0. 
(Each t value is the NFCS sample proportion minus the corresponding CPS population proportion, divided by the sample
proportion's standard error as estimated with RTIFREQS (8). Because the variance of the CPS estimate is ignored, this
t value is slightly inflated.) Focusing on absolute t values greater than 3.0 rather than the more conventional bound of
2.0 (or 1.96) provides a crude adjustment for the number of comparisons being made.
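
The t value defined above, and the effect of raising the bound from 1.96 to 3.0, can be sketched as follows. The simple-random-sampling standard error used here is an assumption made for illustration only; the report's standard errors come from RTIFREQS, so this sketch is not expected to reproduce the tabled t values. The example row is the "over 500%" income class from table 3.

```python
import math


def t_value(sample_pct, population_pct, se_pct):
    """Absolute t value: |sample - population| / SE of the sample percentage."""
    return abs(sample_pct - population_pct) / se_pct


def srs_se_pct(sample_pct, n):
    """SE of a percentage under simple random sampling (an assumption;
    the report's SEs are design-based estimates from RTIFREQS)."""
    p = sample_pct / 100
    return 100 * math.sqrt(p * (1 - p) / n)


def two_sided_tail(t):
    """Two-sided standard normal tail probability P(|Z| > t)."""
    return math.erfc(t / math.sqrt(2))


# Households over 500% of poverty (table 3): 17.4% of 4,495 vs. 21.8% in CPS.
se_srs = srs_se_pct(17.4, 4495)
t_srs = t_value(17.4, 21.8, se_srs)

# Standard error implied by the reported |t| of 3.4 (approximate, because
# the published percentages and t value are rounded).
implied_se = abs(17.4 - 21.8) / 3.4

p_conventional = two_sided_tail(1.96)
p_stricter = two_sided_tail(3.0)
```

Under these assumptions the simple-random-sampling t value comes out well above the reported 3.4, which is consistent with the reported standard errors being larger than simple-random-sampling ones. The two-sided normal tail probability beyond 3.0 is about 0.003, against about 0.05 beyond 1.96, which is why the higher bound is a stricter screen when many comparisons are made.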

The  statistical  analysis  suggests  that  more  than  random  chance  led  to  the  sample  having  the  following  characteristics 
relative  to  the  CPS  population  estimates: 

Households:

(1) A smaller proportion from high-income households
(2) A larger proportion of households with both a male and a female head and a smaller proportion of households with a male head only
(3) A larger proportion of households with exactly two adults

Individuals:

(1) A larger proportion of individuals from low-income households and a smaller proportion from high-income households
(2) A larger proportion of individuals from households with exactly two adults
(3) A smaller proportion of women from households with working female heads
(4) A smaller proportion of men from households with working male heads
(5) A smaller proportion of men and women from households with a female head under 41 years of age and no children
(6) Smaller proportions of individuals 20 to 24 years of age and 15 to 19 years of age

These  results  suggest  that  an  analysis  of  unweighted  NFCS  data  could  be  seriously  biased  because  of  differences 
between  the  sample  and  its  target  population  in  characteristics  believed  to  be  related  to  food  consumption.  However, 
use  of  the  weighting  factors  discussed  in  the  next  chapter  should  reduce,  and  perhaps  even  eliminate,  these  potential 
sources  of  bias. 


Table 3. NFCS 1987-88 household component: Comparisons of the unweighted sample (NFCS) and
population (CPS) characteristics, 1987

Characteristic                        Number in  Percent of   Percent of  |t value|
                                         sample      sample   population

Region:
  Northeast                                 905        20.1         21.2        0.3
  Midwest                                 1,172        26.1         24.7         .3
  South                                   1,567        34.9         34.4         .1
  West                                      851        18.9         19.6         .2

Urbanization:
  Central city                            1,064        23.7         31.2        1.7
  Suburban                                2,122        47.2         46.0         .2
  Nonmetro                                1,309        29.1         22.9        1.3

Household income as a percentage
of poverty level:
  <131%                                   1,041        23.2         20.0        2.2
  131-300%                                1,564        34.8         32.2        2.4
  301-500%                                1,108        24.6         25.9        1.3
  over 500%                                 782        17.4         21.8        3.4

Household presently receiving
food stamps:
  Yes                                       314         7.0          7.4         .6
  No                                      4,181        93.0         92.6

Owns domicile:
  Yes                                     2,998        66.7         64.1        1.5
  No                                      1,497        33.3         35.9

Race of household head:
  Black                                     519        11.5         11.1         .3
  Nonblack                                3,976        88.5         88.9

Age of household head:
  <25                                       338         7.5          7.9         .5
  25-39                                   1,588        35.3         36.1         .8
  40-59                                   1,369        30.5         30.5         .0
  60-69                                     660        14.7         13.0        2.6
  70+                                       540        12.0         12.6         .8

Household head status:
  Both male and female                    3,057        68.0         60.8        6.2
  Female only                             1,044        23.2         26.0        2.7
  Male only                                 394         8.8         13.2        8.6

Female head worked last week:
  Yes                                     1,792        39.9         41.5        1.5
  No                                      2,703        60.1         58.5

Exactly one adult in household:
  Yes                                     1,211        26.9         29.7        2.6
  No                                      3,284        73.1         70.3

Exactly two adults in household:
  Yes                                     2,616        58.2         54.2        4.1
  No                                      1,879        41.8         45.8

Presence of child under age 7:
  Yes                                     1,009        22.4         20.1        2.9
  No                                      3,486        77.6         79.9

Presence of child age 7 to 17:
  Yes                                     1,309        29.1         26.5        2.8
  No                                      3,186        70.9         73.5

Household size, mean                                    2.7          2.6        1.6

Household size squared, mean                            9.5          9.1        1.2

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Table  4.  NFCS  1987-88  individual  intake  component:  Comparisons  of  the  unweighted  sample  (MFCS)  and 
population  (CPS)  characteristics,  by  sex-age  category,  1987 


Characteristic 


Number  in 
sample 


Percent 
of  sample 


Percent 
of  population 


t  value 


(a)  Men  20  years  old  and  over 


Region: 
Northeast 
Midwest 
South 
West 


Presence  of  child 
under  age  7: 

Yes 

No 

Presence  of  child 
age  7  to  17: 

Yes 

No 

Exactly  one  adult 
in  household: 

Yes 

No 

Exactly  two  adults 
in  household: 

Yes 

No 

Household  member 
receives  food  stamps: 

Yes 

No 

Owns  dwelling: 
Yes 
No 

Male  head  worked 
last  week: 
Yes 
No 

Female  head  worked 
last  week: 
Yes 
No 


664 
813 
1,105 
576 


Household  income  as  percentage 

of  poverty  level: 

<131%  547 

131-300%  1,101 

301-500%  863 

over  500%  647 


696 
2,462 


912 
2,246 


317 
2,841 


2,101 
1,057 


121 
3,037 


2,294 
864 


2,123 
1,035 


1,217 
1,941 


21.0 
25.7 
35.0 
18.2 


17.3 
34.9 
27.3 
20.5 


22.0 
78.0 


28.9 
71.1 


10.0 
90.0 


66.5 
33.5 


3.8 
96.2 


72.6 
27.4 


67.2 
32.8 


38.5 
61.5 


21.2 
24.4 
34.1 
20.3 


12.6 
31.3 
29.2 
26.8 


20.7 
79.3 


27.7 
72.3 


11.7 
88.3 


59.8 
40.2 


4.5 
95.5 


70.2 
29.8 


72.5 
27.5 


41.7 
58.3 


0.0 
.3 
.2 
.5 


3.9 
2.6 
1.6 
4.0 


1.4 


1.1 


2.6 


6.7 


1.4 


1.5 


4.4 


2.5 


Continued 


Table  4.  NFCS  1987-88  individual  intake  component:  Comparisons  of  the  unweighted  sample  (NFCS)  and 
population  (CPS)  characteristics,  by  sex-age  category,  1987 — continued 


Number  in  Percent  Percent 

Characteristic  sample  of  sample         of  population         1 1  value  | 


(a)  Men  20  years  old  and  over 


Female  head  under  age 
41  and  no  child 
under  age  1 8: 

Yes 

No 


269 
2,889 


8.5 
91.5 


12.2 
87.8 


6.2 


Race: 
Black 
Nonblack 


257 
2,901 


8.1 
91.9 


10.1 
89.9 


1.2 


Age: 
20-24 
25-39 
40-59 
60-69 
70-1- 


280 
1,148 
970 
433 
327 


8.9 
36.4 
30.7 
13.7 
10.4 


11.7 
37.5 
29.8 
11.9 
9.0 


4.8 
1.2 
.9 
2.6 
1.7 


(b)  Women  20  years  old  and  over 


Region: 
Northeast 
Midwest 
South 
West 


826 
988 
1,424 
729 


20.8 
24.9 
35.9 
18.4 


21.7 
24.5 
34.1 
19.7 


0.2 
.1 
.3 
.3 


Household  income 
as  percentage  of 
poverty  level: 

<  131% 

131-300% 

301-500% 

over  500% 


934 
1,429 
959 
645 


23.5 
36.0 
24.2 
16.3 


19.1 
32.4 
26.5 
22.0 


2.8 
3.3 
2.3 
4.4 


Presence  of  child 
under  age  7: 
Yes 
No 


922 
3,045 


23.2 
76.8 


22.1 
77.9 


1.1 


Presence  of  child 
age  7  to  17: 

Yes 

No 


1,186 
2,781 


29.9 
70.1 


29.3 
70.7 


Exactly  one  adult 
in  household: 

Yes 

No 


808 
3,159 


20.4 
79.6 


19.9 
80.1 


.5 


Exactly  two  adults 
in  household: 

Yes 

No 


2,360 
1,607 


59.5 
40.5 


55.5 
44.5 


3.7 


Continued 
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Table  4.  MFCS  1987-88  individual  intake  component:  Comparisons  of  the  unweighted  sample  (NFCS)  and 
population  (CPS)  characteristics,  by  sex-age  category,  1987 — continued 


Number  in  Percent  Percent 

Characteristic  sample  of  sample         of  population         1 1  value  j 


(b)  Women  20  years  old  and  over 


Household  member 
receives  food  stamps: 

Yes 

No 

Owns  dwelling: 
Yes 
No 

Male  head  worked 
last  week: 

Yes 

No 


286 
3,681 


2,734 
1,233 


1,993 
1,974 


7.2 
92.8 


68.9 
31.1 


50.2 
49.8 


7.7 
92.3 


68.1 
31.9 


50.9 
49.1 


.6 


.5 


Female  head  worked 
last  week: 
Yes 
No 


1,720 
2,247 


43.4 
56.6 


50.6 
49.4 


5.6 


Female  head  under  age 
41  and  no  child 
under  age  18: 

Yes 

No 


444 
3,523 


11.2 
88.8 


18.4 
81.6 


9.0 


Race: 
Black 
Nonblack 


473 
3,494 


11.9 
88.1 


11.4 
88.6 


.3 


Age: 
20-24 
25-39 
40-59 
60-69 
70  + 


344 
1,384 
1,153 
590 
496 


8.7 
34.9 
29.1 
14.9 
12.5 


11.2 
35.0 
28.7 
12.5 
12.6 


4.2 
.1 
.3 

3.4 
.1 


(c)  Persons  under  20  years  old 


Region: 
Northeast 
Midwest 
South 
West 


585 
853 
992 
617 


19.2 
28.0 
32.6 
20.2 


19.3 
25.3 
34.6 
20.9 


0.0 
.5 
.4 
.1 


Household  income 
as  a  percent  of 
poverty  level: 

<  131% 

131-300% 

301-500% 

over  500% 


948 
1,236 
606 
257 


31.1 
40.6 
19.9 
8.4 


26.3 
37.7 
24.3 
11.8 


2.2 
1.6 
3.2 
3.0 


Continued 


Table  4.  NFCS  1987-88  individual  intake  component:  Comparisons  of  the  unweighted  sample  (NFCS)  and 
population  (CPS)  characteristics,  by  sex-age  category,  1987 — continued 


Number  in  Percent  Percent 

Characteristic  sample  of  sample         of  population         1 1  value  | 


(c)  Persons  under  20  years  old 

Presence  of  child 
under  age  7: 

Yes  1,741                      57.1  54.2  2.0 

No  1,306                      42.9  45.8 

Presence  of  child 
age  7  to  17: 

Yes  2,205                      72.4  72.5  .1 

No  842                      27.6  27.5 

Exactly  one  adult 
in  household: 

Yes  427                      14.0  13.9  .1 

No  2,620                      86.0  86.1 

Exactly  two  adults 
in  household: 

Yes  2,113                      69.3  62.7  4.8 

No  934                     30.7  37.3 

Household  member 
receives  food  stamps: 

Yes  426                      14.0  14.8  .5 

No  2,621                      86.0  85.2 

Owns  dwelling: 

Yes  1,995                      65.5  64.3  .5 

No  1,052                       34.5  35.7 

Male  head  worked 
last  week: 

Yes  2,155                     70.7  68.9  1.1 

No  892                      29.3  31.1 

Female  head  worked 
last  week: 

Yes  1,421                      46.6  49.7  1.8 

No  1,626                      53.4  50.3

Female  head  under  age 
41  and  no  child 
under  age  18: 

Yes  26                        0.9  1.0  .9 

No  3,021                      99.1  99.0 

Race: 

Black  425                     13.9  15.6  .7 

Nonblack  2,622                     86.1  84.4 

Age: 

0-4  840                      27.6  26.1  1.4 

5-9  820                      26.9  25.0  2.4 

10-14  703                      23.1  23.4  .4 

15-19  684                       22.4  25.5  3.2 

Chapter 3: Construction of Weighting Factors
Wayne  A.  Fuller,  Marie  M.  Loughin,  and  Harold  D.  Baker 
Iowa  State  University 


The objective of weighting is to make the sample more nearly representative of the population. Because the Current
Population Survey (CPS) provides estimates of demographic characteristics for the population, it can be used to
improve estimates derived from the NFCS 1987-88 data. One way to do this is to compute a weight for each
observation. For example, assume it is known that 15 percent of the population is in category A, but only 12 percent of
the sample falls in the category. Then it is reasonable to assign larger weights to individuals in category A and smaller
weights to those not in A, so that the estimated fraction for category A constructed with the weights is the population
fraction of 15 percent. In a similar manner, if the population mean of a variable is known and if the sample mean is
smaller than the population mean, the procedure we adopt assigns larger weights to large observations and smaller
weights to small observations so that the weighted sample average becomes the population mean.

Two  common  methods  of  computing  weights  using  known  totals  of  auxiliary  variables  are  post  stratification  and 
regression.  In  post  stratification,  the  population  is  divided  into  a  number,  say  k  +  1,  of  mutually  exclusive  and 
exhaustive  cells  where  the  population  number  in  each  cell  is  known  or  estimated  from  another  source.  The  sample  is 
partitioned  into  the  same  cells.  If  the  original  sample  is  self-weighting,  the  weight  for  an  observation  is  the  population 
number  divided  by  the  sample  number  for  the  cell  in  which  the  observation  appears. 
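
The post stratification rule can be sketched in a few lines of Python. This is only an illustration of the cell-ratio idea, using the 15 percent versus 12 percent example from the text; the function name, the sample of 100, and the population of 1,000 are hypothetical and are not taken from the survey.

```python
def poststratification_weights(sample_cells, population_counts):
    """Weight for each observation = population count / sample count of its cell.

    Assumes the original sample is self-weighting, as stated in the text.
    """
    sample_counts = {}
    for cell in sample_cells:
        sample_counts[cell] = sample_counts.get(cell, 0) + 1
    return [population_counts[cell] / sample_counts[cell] for cell in sample_cells]


# Hypothetical sample of 100 observations: 12 percent fall in category A.
sample_cells = ["A"] * 12 + ["not A"] * 88
# Hypothetical population of 1,000: 15 percent are in category A.
population_counts = {"A": 150, "not A": 850}

weights = poststratification_weights(sample_cells, population_counts)
weighted_share_A = (
    sum(w for w, c in zip(weights, sample_cells) if c == "A") / sum(weights)
)
```

Observations in category A receive the larger weight (150/12) and the others the smaller weight (850/88), so the weighted share of category A equals the population share.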

In regression estimation, a row vector of variables denoted by $X_{i\cdot}$ is available for the i-th individual, and the population
totals $X_{\cdot\cdot} = (X_{\cdot 1}, X_{\cdot 2}, \ldots, X_{\cdot k})$ for the vector are known or estimated from another source. A set of weights
$w = (w_1, w_2, \ldots, w_n)'$ is chosen to minimize a function of the weights, say $g(w)$, subject to the restrictions

$$\sum_{i=1}^{n} w_i X_{ij} = X_{\cdot j}, \qquad j = 1, 2, \ldots, k,$$

where $X_{ij}$ is the value of characteristic j for individual i, and $X_{\cdot j}$ is the population total for characteristic j. Post
stratification is a special case of regression estimation in which the vector $X_{i\cdot}$ is composed of k indicator variables for k
of the k + 1 post strata. Regression estimation is discussed by Cochran (9), Bethlehem and Keller (10), and Deville
and Sarndal (11). The regression method of weight construction was chosen for NFCS 1987-88 because of its
generality and flexibility.

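In ordinary regression estimation for a simple random sample, it is common to minimize

$$g(w) = \sum_{i=1}^{n} w_i^2$$

to obtain the ordinary regression estimator. If the original sample is self-weighting, the ordinary regression estimator of
the total of a characteristic Y can be written as

$$\hat{Y} = N\left[\bar{y} + (\bar{X}_{\cdot\cdot} - \bar{x}_{\cdot\cdot})\hat{\beta}\right]$$

or as

$$\hat{Y} = \sum_{i=1}^{n} w_i Y_i,$$

where

$$\hat{\beta} = \left[\sum_{i=1}^{n} (X_{i\cdot} - \bar{x}_{\cdot\cdot})'(X_{i\cdot} - \bar{x}_{\cdot\cdot})\right]^{-1} \sum_{i=1}^{n} (X_{i\cdot} - \bar{x}_{\cdot\cdot})'(Y_i - \bar{y}),$$

$$(\bar{x}_{\cdot\cdot}, \bar{y}) = n^{-1} \sum_{i=1}^{n} (X_{i\cdot}, Y_i),$$

$$w_i = N\left\{n^{-1} + (\bar{X}_{\cdot\cdot} - \bar{x}_{\cdot\cdot})\left[\sum_{g=1}^{n} (X_{g\cdot} - \bar{x}_{\cdot\cdot})'(X_{g\cdot} - \bar{x}_{\cdot\cdot})\right]^{-1} (X_{i\cdot} - \bar{x}_{\cdot\cdot})'\right\},$$

$$X_{\cdot\cdot} = N\bar{X}_{\cdot\cdot},$$

and N is the number of elements in the population.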


The  weights  for  the  regression  estimator  of  the  total  have  the  following  desirable  properties: 

1. The weights, once computed, can be applied to all y-characteristics.

2.  The  sample  weights  applied  to  the  x-characteristics  yield  the  true  total  of  x. 

3.  The  sum  of  the  sample  weights  for  estimating  the  total  is  N, 
where  N  is  the  number  of  elements  in  the  population. 
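
These properties can be checked numerically. The sketch below assumes a single auxiliary variable x (so the matrix inverse in the weight formula reduces to a division) and uses made-up data; the function name and all values are hypothetical. It is not the Huang and Fuller program, only the ordinary regression weights for a self-weighting sample.

```python
def regression_weights(x, pop_mean_x, pop_size):
    """Ordinary regression weights for the total, with one auxiliary variable.

    w_i = N * [1/n + (Xbar - xbar) * (x_i - xbar) / sum_g (x_g - xbar)^2]
    """
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return [pop_size * (1.0 / n + (pop_mean_x - xbar) * (xi - xbar) / sxx) for xi in x]


# Hypothetical data: x could be household size, y any survey characteristic.
x = [1, 2, 2, 3, 4]
y = [10.0, 18.0, 21.0, 30.0, 41.0]
N = 1000            # hypothetical population size
pop_mean_x = 2.6    # hypothetical known population mean of x

w = regression_weights(x, pop_mean_x, N)

sum_of_weights = sum(w)                                   # property 3: equals N
weighted_x_total = sum(wi * xi for wi, xi in zip(w, x))   # property 2: equals N * pop_mean_x
y_total_from_weights = sum(wi * yi for wi, yi in zip(w, y))

# The same estimate written as N * [ybar + (Xbar - xbar) * beta].
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
beta = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sum(
    (xi - xbar) ** 2 for xi in x
)
y_total_from_formula = N * (ybar + (pop_mean_x - xbar) * beta)
```

Because the weights depend only on x, the same list `w` can be reused for any other y-characteristic (property 1).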

Weights  constructed  by  the  ordinary  regression  formulas  may  be  negative  for  observations  far  from  the  mean.  To 
create  weights  that  are  always  positive,  a  modified  regression  weight  generation  method  developed  by  Huang  and 
Fuller  (12)  was  applied  to  the  NFCS  data. 

Regression  weights  were  constructed  for  three  data  sets:  the  household  data,  the  day  1  intake  data,  and  the  3-day 
intake  data.  The  household  data  set  consists  of  4,495  households;  the  day  1  intake  data  set,  10,172  individuals;  and 
the  3-day  intake  data  set,  8,468  individuals. 


Household  Data 

Weight  construction — To  generate  weights,  each  of  the  categorical  variables  in  table  4  (see  chapter  2)  was 
converted  to  a  set  of  indicator  variables.  For  example,  three  variables  were  created  for  the  characteristic  household 
income  as  a  percentage  of  the  poverty  level,  where 


$Z_{t1}$ = 1 if income is < 131% for t-th household,

       = 0 otherwise;

$Z_{t2}$ = 1 if income is 131-300% for t-th household,

       = 0 otherwise;

$Z_{t3}$ = 1 if income is 301-500% for t-th household,

       = 0 otherwise.


The fourth category of household income as a percentage of the poverty level, > 500 percent, was represented by
setting $Z_{t1}$, $Z_{t2}$, and $Z_{t3}$ to zero. In addition to the variables of table 4, three indicator variables were created for the
four seasons (table 5).

Table 5.--Seasonal distribution of household sample


                             Sample       Sample        Target
Season                       frequency    percentage    percentage


Spring (April-June)            1,828        40.7          25.0

Summer (July-September)          678        15.1          25.0

Fall (October-December)          717        16.0          25.0

Winter (January-March)         1,272        28.3          25.0

Employing  this  procedure,  25  indicator  variables  were  created  for  the  household  data  set.  In  addition,  household  size 
and  the  square  of  household  size  were  used  as  continuous  variables. 
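
The indicator coding described for household income can be sketched as follows. The function name is hypothetical, and the treatment of values falling between the published category limits (for example, between 300 and 301 percent) is an assumption made for the illustration.

```python
def income_indicators(income_pct_of_poverty):
    """Return (Z_t1, Z_t2, Z_t3) for one household.

    The fourth category (over 500 percent) is the reference category
    and is coded as (0, 0, 0).
    """
    z1 = 1 if income_pct_of_poverty < 131 else 0
    z2 = 1 if 131 <= income_pct_of_poverty <= 300 else 0
    # Assumption: anything above 300 and up to 500 goes in the third category.
    z3 = 1 if 300 < income_pct_of_poverty <= 500 else 0
    return (z1, z2, z3)


# Hypothetical households at 100, 200, 400, and 600 percent of the poverty level.
coded = [income_indicators(p) for p in (100, 200, 400, 600)]
```

A characteristic with c categories therefore contributes c - 1 indicator variables to the set of controls.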

The 27 variables were used to generate regression weights using the program developed by Huang and Fuller (12).
Constant starting weights were used. One iteration was required to produce a set of real weights such that

$$\sum_{t=1}^{n} w_t X_{tj} = X_{\cdot j}$$

for each of the control variables, where $w_t$ is the weight for the t-th observation, $X_{tj}$ is the value of the j-th control
variable for the t-th observation, n is the number of observations in the group, and $X_{\cdot j}$ is the population total for the j-th
control variable. The weights were then rounded to integer weights, where each weight is a weight in thousands. The
sum of the integer weights is the population total in thousands.

The  sum  of  the  final  weights  is  88,942,  which  is  the  number  of  households  in  the  population  in  thousands.  The 
average  weight  is  19.79,  the  smallest  weight  is  6,  and  the  largest  weight  is  47.  Thus,  the  largest  weight  is  2.38  times 
the  average  weight.  The  average  of  the  squares  of  the  weights  is  515.7.  The  square  of  the  average  weight  is  391.5. 
Thus,  if  a  variable  has  zero  multiple  correlation  with  the  27  variables,  the  variance  of  an  estimate  computed  with  the 
weights will be about (515.7/391.5 =) 1.32 times the variance of the simple unweighted estimator.
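
The variance inflation quoted above is the mean of the squared weights divided by the square of the mean weight. A minimal sketch of that ratio follows; the function name and the two small weight lists are hypothetical, while 515.7 and 391.5 are the figures reported in the text.

```python
def variance_inflation(weights):
    """Mean of squared weights divided by the squared mean weight."""
    n = len(weights)
    mean_of_squares = sum(w * w for w in weights) / n
    square_of_mean = (sum(weights) / n) ** 2
    return mean_of_squares / square_of_mean


# Figures reported in the text for the household weights.
reported_inflation = 515.7 / 391.5

# Hypothetical weights, for illustration only.
equal_weights = [20, 20, 20, 20]
unequal_weights = [6, 15, 20, 47]

inflation_equal = variance_inflation(equal_weights)      # no inflation
inflation_unequal = variance_inflation(unequal_weights)  # greater than 1
```

The ratio equals 1 only when all weights are equal, which is why an unbalanced sample that requires widely varying weights pays a variance penalty for variables unrelated to the controls.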

Efficiency  Comparisons — To  compare  estimates  constructed  with  weights  to  unweighted  estimates,  we  use  these 
household-level  variables: 

Y1 = adjusted total number of meals away from home (meals away),
Y2 = total money value of food used at home (home food), and
Y3 = household size in 21-meal-equivalent persons (meal-persons).


The household size in 21-meal-equivalent persons is the total adjusted meals eaten from household food supplies in
the past 7 days. "Meal-persons" is the sum of two terms. The first term is the sum of the proportions of meals eaten at
home in the interview week by each household member. The second term is the number of meals served to guests,
boarders, and employees during the interview week, divided by 21. In other words:

$$\text{meal-persons for t-th household} = \sum_{i} \frac{h_{it}}{h_{it} + a_{it}} + \frac{b_t}{21},$$

where

$h_{it}$ = meals eaten at home by the i-th individual in the t-th household during the interview week,

$a_{it}$ = meals eaten away from home by the i-th individual in the t-th household during the interview week,

and

$b_t$ = number of meals eaten by nonhousehold members in the t-th household during the interview week.

The adjusted total number of meals eaten away from home is the sum of the proportions of meals eaten away from
home in the interview week by household members, multiplied by 21. In the notation above,

$$\text{meals away for t-th household} = \left(\sum_{i} \frac{a_{it}}{h_{it} + a_{it}}\right) \times 21.$$
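
The two household measures can be sketched as follows for a single household. The function names and the example household are hypothetical, and skipping members who report no meals at all is an assumption made here to avoid division by zero; the report does not say how such cases were handled.

```python
def meal_persons(members, guest_meals):
    """Household size in 21-meal-equivalent persons.

    members: list of (meals_at_home, meals_away) pairs, one per household member.
    guest_meals: meals eaten by nonhousehold members during the interview week.
    """
    home_share = sum(h / (h + a) for h, a in members if h + a > 0)
    return home_share + guest_meals / 21


def meals_away(members):
    """Adjusted total number of meals eaten away from home."""
    return 21 * sum(a / (h + a) for h, a in members if h + a > 0)


# Hypothetical household: three members and 7 meals served to guests.
household = [(21, 0), (14, 7), (7, 14)]
mp = meal_persons(household, guest_meals=7)   # 1 + 2/3 + 1/3 + 7/21
ma = meals_away(household)                    # 21 * (0 + 1/3 + 2/3)
```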

The  total  money  value  of  food  used  at  home  is  the  expenditures  for  purchased  food  plus  the  money  value  of  home- 
produced  food  and  food  received  free  of  cost  that  was  used  during  the  survey  week.  Expenditures  for  purchased  food 
were  based  on  prices  reported  as  paid  regardless  of  the  time  of  purchase;  sales  tax  was  excluded.  Purchased  food 
with  unreported  prices,  food  produced  at  home,  food  received  as  a  gift,  and  food  received  instead  of  pay  were  valued 
at  the  average  price  per  pound  paid  for  comparable  food  by  survey  households  in  the  same  region  and  season. 

The  means  of  the  variables  computed  using  unweighted  data  are  given  in  table  6  in  the  column  headed  "Unweighted 
mean."  The  standard  errors  of  the  estimates  are  given  in  parentheses  below  the  estimates.  The  estimates  and 
standard  errors  for  the  unweighted  estimates  were  computed  in  PC  CARP  (13).  The  stratified  cluster  sample  design  of 
the  NFCS  1987-88  was  accounted  for  in  the  computations. 

Table 6.--Properties of alternative estimators for selected household variables


                                                          Relative
Variable         Unweighted    Weighted    Difference     efficiency of
                 mean          mean                       regression


Meals away          8.27          8.57        -0.30           2.56
                   (0.22)        (0.22)       (0.12)

Home food          59.37         57.49         1.88           5.60
                   (1.12)        (0.91)       (0.39)

Meal-persons        2.33          2.22         0.11         129.00
                   (0.03)        (0.01)       (0.01)

The  variance  of  an  estimate  from  a  clustered  sample  of  households  is  generally  greater  than  the  variance  from  a 
simple  random  sample  containing  the  same  number  of  households.  The  ratio  of  these  two  variances  is  called  the 
design  effect.  Estimated  design  effects  of  the  unweighted  means  for  selected  household  variables  are  presented  in 
table  7. 

Table  7.--Design  effects  of  unweighted  means 
for  selected  household  variables 


Variable  Design  effect 


Meals  away  2.5 
Home  food  4.1 
Meal-persons  2.5 

The column of table 6 headed "Weighted mean" contains the estimates computed with the regression weights. The
standard errors, given in parentheses, were computed in PC CARP using the variance formula for regression
estimation. The variance calculation requires computing a regression for every variable. The standard errors for
unweighted and weighted estimates for meals away and for home food are similar; however, the standard error for
the regression estimate of the population mean of meal-persons is about one-third of the standard error of the
unweighted estimate. The standard error of the regression estimator is smaller because meal-persons is highly
correlated with the household size variables used as controls in the regression procedure.

The estimated multiple correlation, $R^2$, between the variables in the table and the 27 control variables is 0.29, 0.44,
and 0.82 for meals away, home food, and meal-persons, respectively. If the sample means of the control variables were nearly
equal to the population means, the standard error of the regression estimate of meals away would be about
$\sqrt{1 - R^2} = 0.84$ times the standard error of the unweighted estimate. In fact, the estimated standard error of the
regression estimate is about 0.97 times the standard error of the unweighted estimate. The difference is due to the
fact that $\sum_{t=1}^{n} w_t^2$ is consistently bigger than $n^{-1}$ because the sample is unbalanced on a number of items.
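
The balanced-sample factor quoted for meals away can be reproduced for all three variables from the reported multiple correlations. This is a small arithmetic sketch; the function and dictionary names are hypothetical.

```python
import math


def se_ratio_if_balanced(r_squared):
    """Approximate ratio of the regression standard error to the unweighted
    standard error when the sample is balanced on the control variables."""
    return math.sqrt(1.0 - r_squared)


# Multiple correlations reported in the text for the three household variables.
r_squared = {"meals away": 0.29, "home food": 0.44, "meal-persons": 0.82}
ratios = {name: se_ratio_if_balanced(r2) for name, r2 in r_squared.items()}
```

For meals away the factor rounds to 0.84, the value given in the text; the actual ratio is closer to 1 because of the unequal weights.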

Table 6 also contains the estimated differences between the unweighted and weighted estimators of the mean. The
difference between the unweighted and the weighted estimated mean is

$$\sum_{t=1}^{n} Y_t/n - \sum_{t=1}^{n} a_t Y_t = \sum_{t=1}^{n} (1/n - a_t) Y_t,$$

where $a_t = \left(\sum_{g=1}^{n} w_g\right)^{-1} w_t$.

To compute the variance of the difference between the means, we note that the hypothesis of a zero difference is
equivalent to the hypothesis that the correlation between $a_t$ and $Y_t$ is zero. Therefore, using PC CARP, we computed
the unweighted regression of $Y_t$ on $a_t$ and computed the variance of the regression coefficient under the design. The
standard errors for the difference in table 6 are such that the "t-statistic" for the hypothesis of zero difference is equal to
the "t-statistic" for the coefficient of $a_t$ in the regression of $Y_t$ on $a_t$.

For  all  three  characteristics,  the  difference  between  weighted  and  unweighted  estimators  of  the  population  mean  is 
significant  at  traditional  levels.  Thus,  under  the  assumption  that  the  regression  estimators  are  unbiased,  there  are 
significant  biases  in  the  unweighted  estimators.  We  do  not  know  that  the  regression  estimator  is  unbiased,  but  it 
seems  reasonable  to  assume  that  the  regression  adjustment  reduces  the  bias  in  the  estimators  of  the  population 
mean. 

The last column of table 6 contains the ratio of the estimated mean square error of the unweighted estimator to the
variance of the regression estimator. The estimated mean square errors for the unweighted estimators were
computed as

$$\widehat{\mathrm{MSE}}_u = \hat{V} + \max\{0, (\mathrm{Diff})^2 - (\mathrm{s.e.\ diff})^2\},$$

where $\hat{V}$ is the estimated variance of the unweighted estimate, Diff is the difference between the two estimates from
table 6, and s.e. diff is the standard error of the difference from table 6. The second term of the estimated mean
square error is the estimated squared bias. The estimated mean square errors of the weighted estimators are the
variances of the weighted estimators computed as the squares of the standard errors of table 6.
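
The relative efficiencies in the last column of table 6 can be reproduced from the other entries of the table. The sketch below applies the mean square error formula above to the published (rounded) standard errors and differences; the function and variable names are hypothetical.

```python
def relative_efficiency(se_unweighted, se_weighted, diff, se_diff):
    """Estimated MSE of the unweighted mean divided by the variance of the
    regression (weighted) mean."""
    squared_bias = max(0.0, diff ** 2 - se_diff ** 2)
    mse_unweighted = se_unweighted ** 2 + squared_bias
    return mse_unweighted / se_weighted ** 2


# (s.e. unweighted, s.e. weighted, difference, s.e. of difference) from table 6.
table6 = {
    "Meals away": (0.22, 0.22, -0.30, 0.12),
    "Home food": (1.12, 0.91, 1.88, 0.39),
    "Meal-persons": (0.03, 0.01, 0.11, 0.01),
}

efficiency = {name: relative_efficiency(*row) for name, row in table6.items()}
```

With these inputs the computed ratios agree with the published values of 2.56, 5.60, and 129.00 to within rounding.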

Of  the  three  characteristics  for  which  the  population  mean  was  estimated,  the  estimated  relative  efficiency  of  the 
regression  estimator  to  the  simple  mean  ranges  from  2.5  to  129.  The  regression  estimator  for  meals  away  has  the 
smallest  estimated  efficiency.  The  variances  of  the  two  estimators  are  similar,  but  because  of  the  estimated  bias,  the 
regression estimate for meals away is estimated to have a mean square error that is about (1/2.5 =) 40 percent of that
of the unweighted estimate. The mean square error of the regression estimate for home food is less than 20 percent
of that of the unweighted estimate, and that for meal-persons is about 1 percent of that of the unweighted estimate. In
all cases, the squared bias is a very important component of the estimated mean square error.

Even  after  allowing  for  the  fact  that  the  population  totals  from  the  Current  Population  Survey  are  not  known  population 
totals,  it  is  clear  that,  for  these  items,  large  gains  in  accuracy  are  associated  with  regression  estimation  for  the 
population  means. 


Individual  Data 

Weight  construction — The  data  set  for  individuals  providing  day  1  dietary  intakes  consists  of  10,172  persons.  The 
8,468  persons  providing  3-day  dietary  intakes  are  a  subset  of  the  10,172  individuals  who  provided  1-day  intakes. 

For  both  individual  data  sets,  weights  were  constructed  separately  for  each  of  three  sex-age  groups;  namely,  men  age 
20 and over, women age 20 and over, and persons under 20 years old. There are 3,158 observations for the men,
3,967  observations  for  the  women,  and  3,047  observations  for  persons  less  than  20  years  old  in  the  day  1  data  set. 
There  are  2,619  men,  3,293  women,  and  2,556  persons  under  age  20  in  the  3-day  data  set. 

The  13  characteristics  in  table  3  (see  chapter  2)  were  converted  to  indicator  variables  that  could  be  used  in  a 
regression  analysis.  Using  this  procedure  on  the  13  characteristics  resulted  in  20  control  variables  for  the  men,  20  for 
the  women,  and  19  for  those  under  20  years  old  (the  latter  group  having  one  less  age  category). 

In addition, control variables were created for day-of-observation and month-of-observation by race (black, nonblack)
for  each  of  the  three  sex-age  categories.  Twelve  control  variables  were  created  for  the  day  effects  (6  for  nonblack  and 
6  for  black)  and  22  were  created  for  the  month  effects  (11  for  nonblack  and  11  for  black)  for  each  sex-age  group.  In  all, 
there  were  54  control  variables  each  for  the  men  and  women  and  53  control  variables  for  those  less  than  age  20.  The 
population  totals  for  the  day  and  month  effects  were  calculated  by  dividing  the  population  total  for  each  race  by  7  and 
12  for  each  sex-age  group. 

The  weights  were  greatly  influenced  by  the  distribution  of  observations  over  day  of  the  week  and  month.  For  the  day  1 
sample,  the  number  of  Saturday  observations  is  well  below  that  expected  under  an  even  distribution  of  observations 
over  day  of  the  week  (table  8).  Overall,  the  sample  contained  4.7  percent  Saturday  observations,  whereas  14.3 
percent  was  expected.  Black  men  had  the  lowest  fraction  of  Saturday  observations,  3.5  percent.  The  uneven 
distribution  of  observations  over  day  of  the  week  can  be  explained  by  the  lack  of  interviewers  working  on  Sundays.  In 
a  Sunday  interview,  the  first  day  of  observation  is  the  Saturday  information  collected  by  recall. 

There  was  also  an  uneven  distribution  of  observations  over  months,  which  was  partly  due  to  the  fact  that  the  data  were 
collected  over  a  17-month  period.  Nearly  70  percent  of  the  observations  for  the  day  1  sample  were  taken  during  the  6 
months of January through June; over half of the observations were obtained during the 4 months of March through
June.  The  distribution  of  observations  over  months  was  similar  for  the  three  sex-age  groups. 

The  weights  for  the  3-day  sample  were  influenced  by  the  distribution  of  observations  over  day  of  the  week  and  month 
in  the  same  fashion  as  the  day  1  sample  weights.  The  day  of  the  week  on  which  the  first  day  of  the  3-day  observation 
period  was  conducted  was  used  as  the  control  variable  in  constructing  weights  for  the  three-day  sample. 

The  weight  program  was  applied  separately  to  each  of  the  three  sex-age  groups.  Constant  starting  weights  were  used 
for  the  weight  generation  program  for  each  group.  The  iterations  within  the  program  are  designed  to  produce  weights 
which  are  all  nonnegative  and  such  that  the  largest  weights  are  not  overly  large  relative  to  the  average  weight.  The 
program  then  rounds  the  real  weights  to  integer  weights,  so  that  the  sum  of  the  integer  weights  is  the  population  total 
in  thousands.  Iteration  is  used  to  construct  integer  weights  such  that  the  maximum  deviation  between  the  estimated 
and  actual  population  totals  was  five  (that  is,  5,000). 

The mean weights are 25.1, 22.3, and 23.4 for men, women, and persons under age 20, respectively. The largest weight for men
is 5.18 times the mean weight for men. The analogous ratios for women and persons under 20 are 3.50 and 5.81. The
weights range from 1 to 136 for the 10,172 observations. Two individuals had a weight of 136 and 11 had a weight of
1. The ranges of the weights are from 1 to 130 for men, from 1 to 78 for women, and from 1 to 136 for persons under
20. The ratios of the mean of the squared weights to the mean of the weights squared are 2.50, 2.13, and 2.42 for
men, women, and persons under age 20, respectively.

Table 8.--Temporal distributions for day 1 and 3-day samples


                   Day 1         3-day
                   sample        sample        Target
Characteristic     percentage    percentage    percentage


(a) Men age 20 and over

Day of week:

Sunday               20.3          20.6          14.3

Monday               17.5          17.1          14.3

Tuesday              17.6          17.2          14.3

Wednesday            15.3          15.7          14.3

Thursday             13.3          13.5          14.3

Friday               11.6          11.5          14.3

Saturday              4.5           4.4          14.2

Month:

January               8.2           8.3           8.3

February              8.3           8.9           8.4

March                12.2          11.7           8.3

April                11.0          10.8           8.3

May                  17.5          17.5           8.3

June                 12.8          12.3           8.4

July                  3.6           3.1           8.3

August                3.4           3.5           8.4

September             6.7           7.5           8.3

October               7.0           6.8           8.3

November              3.2           3.3           8.4

December              6.1           6.2           8.3

(b) Women age 20 and over

Day of week:

Sunday               20.0          20.3          14.3

Monday               18.1          18.3          14.3

Tuesday              17.0          16.9          14.3

Wednesday            15.6          15.9          14.3

Thursday             13.0          12.7          14.3

Friday               11.9          11.8          14.3

Saturday              4.4           4.2          14.2

Month:

January               8.8           8.9           8.3

February              8.2           8.6           8.4

March                11.6          11.4           8.3

April                10.5          10.3           8.3

May                  17.5          17.3           8.3

June                 12.9          12.3           8.4


Table  8.-Temporal  distributions  for  day  1  and  3-day  samples — continued 


                   Day 1         3-day
                   sample        sample        Target
Characteristic     percentage    percentage    percentage


(b) Women age 20 and over

July                  4.3           3.5           8.3

August                3.6           3.6           8.4

September             7.2           7.0           8.3

October               6.6           7.0           8.3

November              3.1           3.2           8.4

December              5.8           6.0           8.3

(c) Persons under 20 years old

Day of week:

Sunday               18.6          18.2          14.3

Monday               17.7          17.9          14.3

Tuesday              17.9          18.1          14.3

Wednesday            15.1          15.7          14.3

Thursday             13.6          13.7          14.3

Friday               11.8          11.3          14.3

Saturday              5.2           5.2          14.2

Month:

January               9.5           9.6           8.3

February              8.7           9.2           8.4

March                11.7          11.7           8.3

April                11.8          11.2           8.3

May                  16.1          16.4           8.3

June                 12.1          11.3           8.4

July                  4.0           3.4           8.3

August                3.2           2.9           8.4

September             7.6           8.4           8.3

October               6.3           6.4           8.3

November              3.4           3.6           8.4

December              5.6           5.8           8.3

A  procedure  similar  to  that  used  on  the  day  1  data  set  was  used  to  find  regression  weights  for  the  3-day  data  set.  Of 
the  8,468  subjects  in  the  3-day  sample,  2,619  were  men  age  20  and  over,  3,293  were  women  age  20  and  over,  and 
2,556  were  persons  under  age  20.  The  same  54  control  variables  were  used  on  each  of  the  three  sex-age  groups  in 
constructing  weights  via  the  weight  generation  program.  Rather  than  using  constant  starting  weights,  however,  the 
final  weights  found  for  the  day  1  survey  subjects  who  participated  in  the  3-day  study  were  used  as  starting  weights  for 
the  3-day  sample. 

The  means  of  the  weights  for  the  3-day  data  are  30.3,  26.8,  and  27.9  for  men,  women,  and  persons  under  age  20.  The 
means  of  the  squares  of  the  weights  are  2.54,  2.19,  and  2.71  times  the  mean  weight  squared  for  men,  women,  and 
persons  under  20.  These  ratios  are  slightly  larger  than  those  for  the  day  1  weights.  A  slight  increase  would  be 
anticipated  because  the  same  number  of  control  variables  are  being  used  on  a  smaller  sample.  Each  control  variable 
imposes a restriction on the weights. The weights for the 3-day data range from 1 to 231 for men, 1 to 142 for women,
and 1 to 259 for persons under 20. The largest weights are 7.62, 5.30, and 9.28 times the mean weight for men,
women,  and  persons  under  age  20.  These  ratios  are  also  larger  than  the  corresponding  ratios  in  the  day  1  data  set  for 
the  same  reason. 

Efficiency  comparisons — The  day  1  sample  was  used  to  compare  the  efficiency  of  the  regression  estimator  with  that 
of  the  simple  estimator.  The  following  variables  were  used  in  the  comparison: 

Y1 = indicator to identify pregnant/lactating women (nurse)

Y2 = food energy intake as percentage of Recommended Dietary Allowance (%RDA)

Y3 = total fluid milk intake (milk)

Y4 = total food energy intake (energy)

Y5 = away from home food energy (energy out)

The  means  and  standard  errors  for  the  unweighted  estimates  were  computed  in  PC  CARP,  recognizing  that  the 
sample  is  a  stratified  cluster  sample.  As  was  the  case  with  the  household  data,  the  estimated  variances  are  larger 
than  the  variance  of  a  simple  random  sample  containing  the  same  number  of  individuals.  This  is  due  to  the 
correlations  among  elements  within  clusters.  The  estimated  design  effects  are  shown  in  table  9. 


Table 9.--Design effects of unweighted means for selected individual variables


                 Men age 20    Women age 20    Persons under
Variable         and over      and over        age 20


Nurse              NA             1.3             0.9

%RDA               2.7            1.7             2.5

Milk               2.0            1.5             2.2

Energy             2.8            1.8             2.5

Energy out         1.8            1.7             3.0

The  means  of  the  variables  computed  using  unweighted  data  are  given  in  table  10  in  the  column  headed  "Unweighted 
mean."  Means  for  the  five  variables  are  given  for  each  of  the  three  sex-age  groups:  men,  women,  and  persons  under 
age  20.  The  standard  errors  are  given  in  parentheses  below  the  estimates.  In  table  10,  estimates  computed  using  the 
regression  weights  are  given  in  the  column  headed  "Weighted  mean."  The  standard  errors  were  computed  in  PC 
CARP  using  the  formula  for  the  regression  estimator.  The  variance  calculation  requires  computing  a  regression  for 
each  variable.  The  standard  errors  of  the  weighted  and  unweighted  estimates  for  these  individual  characteristics 
generally  differ  more  than  did  the  standard  errors  for  the  household  characteristics,  and  in  every  case  the  standard 
error  of  the  weighted  estimate  exceeds  that  of  the  unweighted  estimate.  These  characteristics  were  not  highly 
correlated  with  the  variables  used  as  controls  in  the  regression  procedure. 


Table 10.--Properties of alternative estimators for selected individual variables


                                                          Relative
                 Unweighted    Weighted                   efficiency of
                 mean          mean        Difference     regression


Men age 20 and over

%RDA                78.57         79.80        -1.23          0.96
                    (1.00)        (1.29)       (0.95)

Milk               192.07        201.56        -9.49           .87
                    (7.73)       (10.48)       (7.36)

Energy           2,100.00      2,153.97       -53.97          2.37
                   (27.59)       (35.52)      (26.27)

Energy out         471.36        570.44       -99.08          8.52
                   (17.86)       (33.52)      (23.70)

Women age 20 and over

Nurse                3.58          3.90        -0.32           .72
                    (0.34)        (0.45)       (0.27)

%RDA                71.43         71.25         0.18           .43
                    (0.63)        (0.96)       (0.59)

Milk               153.32        147.77         5.56          1.30
                    (4.25)        (5.37)       (3.36)

Energy           1,493.05      1,497.01        -3.96           .44
                   (13.58)       (20.38)      (12.38)

Energy out         303.95        339.04       -35.09          7.53
                   (10.61)       (12.95)       (8.95)

Persons under 20 years old

%RDA                85.20         85.27        -0.07           .72
                    (0.97)        (1.14)       (0.96)

Milk               343.42        340.89         2.52           .47
                    (8.76)       (12.77)      (11.34)

Energy           1,669.44      1,707.97       -38.53          2.04
                   (22.95)       (26.40)      (24.32)

Energy out         411.53        441.56       -30.03          1.86
                   (18.68)       (20.97)      (20.83)

The estimated multiple correlations between the variables in table 10 and the control variables ranged from 0.03
(%RDA for women) to 0.26 (energy for persons under age 20). If the sample means and the population means for the
control variables had been nearly equal, the variance of the regression estimate of %RDA for women would have
been about (1 - 0.03) = 0.97 times the variance of the unweighted estimate. For energy for persons under 20, the
variance of the regression estimate would have been about (1 - 0.26) = 0.74 times the variance of the unweighted
estimate. Actually, the estimated variance of the regression estimate is about 2.79 times the variance of the
unweighted estimate for %RDA for women and about 1.32 times the variance of the unweighted estimate for energy
for persons under age 20. The difference is due to the fact that the sample is unbalanced on a number of items, so
that $\sum_{i} w_i^2 / n$ is considerably greater than $\bar{w}^2$ (2.14 times as large for women, 2.49 times for men, and 2.48 times
for persons under age 20).

Significant differences between the weighted and unweighted estimated means were found for energy out for women,
and for energy and energy out for men. In addition, as is shown in the last column of table 10, the estimated relative
efficiency of the regression estimator was greater than 1 for milk for women, and for energy and energy out for persons
under age 20.

As was the case with the household data, substantial gains can be achieved with regression estimation for the
population means. But gains are not assured, as table 10 illustrates. Losses in efficiency, when they do occur, are
generally small relative to the gains in efficiency for other variables. The estimated low efficiency of the regression
estimator in the cases of %RDA and energy for women, and milk for persons under age 20, is a result of the relatively
small estimated biases in the unweighted estimates.


Chapter  4:  Comparison  of  Results  With  Other  Surveys 
P.  Peter  Basiotis,  Human  Nutrition  Information  Service,  and 
Milton  R.  Goldsamt,  National  Agricultural  Statistics  Service 


Chapter  2  showed  that  the  original  NFCS  1987-88  respondent  sample  was  unbalanced  with  respect  to  a  number  of 
demographic  characteristics.  The  weights  described  in  chapter  3  were  designed  to  make  the  weighted  estimates  of 
the  control  characteristics  equal  to  the  known  control  totals.  If  the  control  characteristics  are  correlated  with  the  items 
of  interest,  the  weighting  will  reduce  the  bias  associated  with  the  original  nonresponse.  One  cannot  make  a  direct 
evaluation  of  the  remaining  potential  nonresponse  bias;  however,  one  can  obtain  some  indirect  information  by 
comparing  the  weighted  estimates  from  NFCS  1987-88  with  estimates  from  other  surveys  having  higher  response 
rates  and  sampling  the  same  (or  similar)  target  population.  Three  types  of  surveys  were  considered: 

(1) contemporaneous surveys that contain similar or identical sociodemographic
variables but have no information on food intake,

(2)  contemporaneous  surveys  that  contain  similar  or  identical  sociodemographic 
and  health-related  variables  and  limited  information  on  food  intake,  and 

(3)  past  surveys  that  contain  sociodemographic  and  food  intake  variables  that 
are  identical  or  very  similar  to  those  in  NFCS  1987-88. 

A  survey  of  the  first  type  is  the  1987  Current  Population  Survey  (CPS),  which  was  used  to  determine  population 
characteristics,  compare  them  with  the  unweighted  NFCS  sample  (chapter  2),  and  construct  the  weights  (chapter  3). 

A  survey  of  the  second  type  is  the  1987  National  Health  Interview  Survey  (NHIS),  Cancer  Risk  Factor  Supplement, 
Epidemiology  Study,  conducted  by  the  U.S.  Bureau  of  the  Census  for  the  National  Center  for  Health  Statistics,  DHHS 
(14).  The  NHIS  contains  several  sociodemographic  and  health-related  variables  that  are  identical  or  very  similar  to 
those  of  the  NFCS.  In  addition,  the  NHIS  contains  a  few  frequency-of-food-intake  variables  that  are  similar  to  the 
NFCS  food-frequency  variables. 

Three surveys of the third type, all conducted by the U.S. Department of Agriculture, are available. These are:

• the 1985 Continuing Survey of Food Intakes by Individuals (CSFII 1985), which included the
collection of dietary and other data on women 19 to 50 years of age, their children 1 to 5 years
of age, and men 19 to 50 years of age;

• the CSFII 1986, which was similar to the 1985 survey except that it did not include men; and

• the 1977-78 Nationwide Food Consumption Survey (NFCS 1977-78).


Current  Population  Survey 

The estimates of population distributions of six demographic variables not used as controls when creating the NFCS
weights were compared using the 1987 CPS and NFCS 1987-88 weighted day-1 data for individuals. The variables
were examined for how similar the NFCS estimates were to the CPS estimates overall (table 11), and for three
subgroups: men 20 years of age and over, women 20 years of age and over, and persons under 20 years of age.

The variables compared were:

• Persons living in households with a given education level of the male head of
household (highest grade completed)

• Persons living in households with a given education level of the female head of
household (highest grade completed)

• Geographic location of household (Census geographic division)

• Household size

• Ethnic origin (Spanish/Hispanic or not; or not reported)

• Race of individual, including "other," a category not used in the weighting.

Results indicated that the weighting strategy used for the NFCS was effective in making the individual intake data
representative of other population characteristics. For three variables (education level of the male head, household
size, and race), NFCS and CPS percentages were quite similar, with most categories differing by no more than
2 percentage points (table 11). For the other three variables there were a few differences, but these tended
to be limited to certain categories of those variables: 13 years or more of education for persons living in households
with a female head of household; the Mountain and Pacific States, where the differences in one direction offset those
in the other; and Hispanic origin, where differences of about 5 percentage points may be due to differences in wording.
CPS respondents selected their Spanish ethnic origin from a flash card listing eight categories (Mexican, Cuban, etc.).
NFCS household respondents were asked by the interviewer if anyone in the household was of Hispanic origin or
descent (a two-choice answer).


Table 11.--Comparison of distributions of population characteristics of individuals from weighted NFCS
and the CPS

Characteristic                      NFCS      CPS    Difference
                                    ---------- percent ----------
Education level of male head:
  None                               0.1      0.4       0.3
  1-8 years                          7.6      9.0       1.4
  9-11 years                        10.3      9.0       1.3
  12 years                          28.4     29.5       1.1
  13+ years                         34.8     33.5       1.3
  No male head                      18.8     18.6       0.2

Education level of female head:
  None                               0.1      0.4       0.3
  1-8 years                          7.2      8.8       1.6
  9-11 years                        10.4     11.4       1.0
  12 years                          38.7     40.5       1.8
  13+ years                         37.6     31.0       6.6
  No female head                     5.9      8.0       2.1

Geographic division:
  New England                        5.9      5.3       0.6
  Middle Atlantic                   14.9     15.5       0.6
  East North Central                17.6     17.4       0.2
  West North Central                 7.1      7.3       0.2
  South Atlantic                    17.9     17.0       0.9
  East South Central                 4.4      6.3       1.9
  West South Central                11.9     11.0       0.9
  Mountain                           9.0      5.4       3.6
  Pacific                           11.2     14.9       3.7

Household size:
  1 person                           9.1      9.0       0.1
  2 persons                         24.4     24.2       0.2
  3 persons                         21.2     20.5       0.7
  4 persons                         24.2     23.6       0.6
  5+ persons                        21.0     22.8       1.8

Ethnicity:
  Hispanic                           4.3      7.8       3.5
  Non-Hispanic                      95.4     90.5       4.9
  Not reported                       0.3      1.7       1.4

Race:
  White                             82.9     84.7       1.8
  Black                             12.2     12.2       0.0
  Other                              4.9      3.1       1.8


Overall, the scattered differences do not seem widespread enough to suggest a systematic pattern covering all
variables investigated or affecting all comparisons made for a particular subgroup of sampled persons.
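
The Difference column of table 11 is the absolute gap between the NFCS and CPS percentages. The short sketch below uses the ethnicity rows of table 11 and the 2-percentage-point yardstick mentioned in the text to show how such a comparison might be scripted. The function and variable names are illustrative only.

```python
def abs_differences(nfcs, cps):
    """Absolute difference, in percentage points, for each category."""
    return {k: round(abs(nfcs[k] - cps[k]), 1) for k in nfcs}


# Ethnicity rows of table 11 (weighted NFCS and CPS, percent).
nfcs_ethnicity = {"Hispanic": 4.3, "Non-Hispanic": 95.4, "Not reported": 0.3}
cps_ethnicity = {"Hispanic": 7.8, "Non-Hispanic": 90.5, "Not reported": 1.7}

diffs = abs_differences(nfcs_ethnicity, cps_ethnicity)

# Categories that differ by more than 2 percentage points.
flagged = [k for k, d in diffs.items() if d > 2.0]

print(diffs)
print(flagged)
```

The same function can be applied to any other block of table 11 to list the categories that exceed the chosen threshold.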


National  Health  Interview  Survey 

The 1987 NHIS Cancer Risk Factor Supplement had an overall response rate of about 82 percent (15). The
supplement design incorporated a split sample; about half of the 44,123 respondents were included in the Epidemiology
Study. Because of the higher response rate in the 1987 NHIS study, nonresponse bias is considerably less likely than
in NFCS 1987-88. Some questions on sociodemographic and health variables were asked in an identical, or very
similar, manner in both surveys, making some direct comparisons possible. Food frequency data available from the
two surveys were also examined. Population estimates using weighted data from NFCS and NHIS were compared.

Nine  sociodemographic  variables  were  considered  similar  enough  in  the  two  surveys  for  comparison:  region, 
urbanization,  age,  race,  ethnic  origin,  household  income,  education  of  the  household  head,  employment  status,  and 
living alone. The estimated percentage of the population that was living in each of the four census regions was nearly
identical: about one-fifth of the sample in each survey lived in the Northeast, another one-fifth in the West, one-fourth
in  the  Midwest,  and  one-third  in  the  South.  The  estimated  mean  levels  of  several  characteristics  were  in  fairly  close 
agreement  between  the  two  surveys:  age  (figure  1),  employment  status  (figure  2),  household  income  expressed  as  a 
percentage  of  the  Federal  poverty  level,  and  education  level  of  the  household  head.  For  other  characteristics,  there 
was  less  agreement  between  the  estimated  means  from  the  two  surveys.  Fewer  men  (figure  3)  and  women  18  years 
old  and  over  lived  in  central  cities,  and  more  lived  in  suburban  and  nonmetropolitan  locations,  according  to  NFCS 
estimates  compared  with  NHIS  estimates.  NHIS  estimates  showed  more  individuals  living  alone  (figure  4)  than  did 
NFCS;  the  NFCS  estimates  are  closer  to  the  CPS  population  estimates  than  the  NHIS  estimates  are.  According  to 
both  NFCS  and  NHIS,  blacks  made  up  about  the  same  proportions  of  the  total  population  (10  percent  for  men  and  12 
percent  for  women).  For  the  total  population  and  in  all  regions,  the  NFCS  apparently  underestimated  the  proportion  of 
individuals  of  Hispanic  origin;  the  NHIS  estimates  were  close  to  the  CPS  estimates. 

The  estimated  mean  levels  for  self-reported  height  (figure  5),  weight  (figure  6),  and  body  mass  index  were  nearly 
identical  in  NFCS  1987-88  and  the  NHIS  1987,  as  were  the  percentages  of  individuals  reporting  health  status  as  good, 
very good, or excellent. However, the NFCS estimates showed a smaller percentage of men and women taking vitamin
and mineral supplements than did the NHIS estimates (figure 7). Also, the NFCS estimates showed lower percentages of
men and women who had quit smoking (figure 8), but higher percentages who had never smoked.

In NFCS 1987-88, participants were asked questions pertaining to frequency of consumption of 11 calcium-rich foods
over the last 3 months. In NHIS 1987, participants were administered a food frequency questionnaire containing
approximately 60 food items. Participants were asked about their frequency of consumption over the past year. To
make the responses as comparable as possible, the NFCS results were multiplied by four to give an estimate of
frequency of consumption over the past year.

Six food frequency questions were similar enough across the two surveys to allow comparison. There was close
agreement only for women's consumption of cheese, although estimated cheese intake by women in the West was
higher in the NFCS 1987-88 than in the NHIS 1987 (figure 9). Milk as a beverage, milk in coffee/tea, ice cream (figure
10), and dry beans (figure 11) showed less agreement for both men and women. Consumption of dark-green leafy
vegetables was dramatically different (figure 12). As in the case of the sociodemographic and health characteristics,
most of the differences in the average food frequency levels were in the same direction across sex and region
classifications.
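
The adjustment that put the NFCS 3-month frequencies on the NHIS past-year basis is a simple scaling. A minimal sketch follows; the function name and the example count are hypothetical.

```python
PERIODS_PER_YEAR = 4  # four 3-month recall periods in one year


def annualize(times_in_3_months):
    """Scale a 3-month consumption frequency to a yearly frequency."""
    return times_in_3_months * PERIODS_PER_YEAR


# Hypothetical respondent who reported eating a food 12 times in 3 months.
print(annualize(12))
```

This scaling implicitly treats the 3-month window as typical of the whole year, which is one reason the annualized NFCS frequencies and the NHIS past-year frequencies may not be strictly comparable.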

Food  frequency  differences  may  be  attributable  to  methodological  differences  between  the  two  surveys,  including  the 
different  lengths  of  the  recall  periods  (past  3  months  in  NFCS  versus  the  past  year  in  NHIS)  and  differences  in  the 
wording  and  context  of  the  questions.  For  example,  the  context  of  the  frequency  of  consumption  of  dark-green  leafy 
vegetables  question  differed  in  the  two  surveys.  In  NFCS,  this  question  was  not  preceded  by  any  salad  or  vegetable 
questions,  but  in  NHIS  it  followed  the  question  about  salad  consumption.  It  seems  likely  that  some  NHIS  respondents 
may  have  included  dark-green  leafy  vegetable  consumption  in  their  answers  to  the  salad  question,  or  some  NFCS 
respondents  may  have  included  lettuce  in  their  dark-green  leafy  vegetable  answer.  Either  would  have  contributed  to 
the  large  differences  in  the  estimated  mean  frequencies  of  dark-green  leafy  vegetable  consumption  between  the  two 
surveys. 

Because of these methodological differences, it is difficult to judge whether the NFCS variables that had different
estimated  mean  levels  from  those  of  the  NHIS  were  subject  to  nonresponse  bias.  On  the  other  hand,  it  is  reassuring 
that  several  of  the  NFCS  and  NHIS  variables  had  nearly  identical  estimated  mean  levels  for  the  population. 


Previous  USDA  Surveys:  1977-78, 1985,  and  1986 

The nonresponse evaluation included a comparison of food energy from NFCS 1987-88 with that from previous surveys
conducted by USDA: the 1977-78 Nationwide Food Consumption Survey and the Continuing Surveys of Food Intakes
by Individuals conducted in 1985 and 1986 (figure 13). Although these surveys had different methodologies, designs,
and  target  samples,  all  four  surveys  included  a  24-hour  recall  of  dietary  intake  in  April  and  May;  the  comparisons  were 
limited  to  these  data.  In  addition,  since  the  CSFII  1985  and  1986  targeted  only  specific  sex-age  groups,  the 
comparisons  were  limited  to  children  1  to  5  years  old  and  women  19  to  50  years  old. 

Many  factors  may  have  contributed  to  the  differences  between  estimated  food  energy  for  1977-78,  1985,  1986,  and 
1987-88,  including  (1)  true  differences  in  population  intakes,  (2)  sampling  error,  (3)  differences  in  the  weighting 
procedures  used,  (4)  differences  in  respondent  burden  caused  by  the  presence  or  absence  of  the  household 
component  of  the  survey,  (5)  artifactual  changes  in  the  food  composition  data  base  resulting  from  improvements  in 
food  sampling  and  analytical  techniques  and  larger  sample  sizes  (16),  and  (6)  nonresponse  bias. 

The  purpose  of  the  analysis  was  to  seek  evidence  of  nonresponse  bias;  however,  the  differences  that  were  found 
appear  to  have  been  caused  by  the  differences  in  methodology,  design,  and  target  samples  rather  than  by 
nonresponse.  The  estimates  from  the  two  Nationwide  Food  Consumption  Surveys  were  generally  more  similar  to 
each other than they were to estimates from the two CSFII surveys. These estimates most likely reflect the differences in
respondent  burden  between  NFCS  and  CSFII. 


Figures 1-4.--Comparisons of selected sociodemographic variables in NFCS 1987-88 and NHIS 1987

Figures 5-8.--Comparison of health variables in NFCS 1987-88 and NHIS 1987

Figures 9-12.--Comparisons of selected food frequency variables in NFCS 1987-88 and NHIS 1987
(number of times consumed per year), individuals 18+ years of age. Panels: cheese (figure 9), ice cream
(figure 10), dry beans (figure 11), and dark-green leafy vegetables (figure 12), each shown for men and women.

Figure 13.--Estimated food energy intakes from four USDA surveys, 1 day, April and May


Chapter 5: Intrasurvey Comparison

Rhonda  S.  Sebastian,  Human  Nutrition  Information  Service 


Households drawn into the NFCS 1987-88 sample participated to varying degrees. Some households completed all
three  components:  the  household  survey,  the  1-day  recall,  and  the  2-day  record.  However,  a  large  number  of 
households  that  participated  in  the  household  component  had  individual  household  members  who  declined  further 
participation  in  the  individual  intake  components  or  provided  only  1  day  of  recall  data.  The  purpose  of  this  study  is  to 
determine  if  there  was  an  association  between  level  of  household  participation  and  dietary  quality.  Can  the 
conclusions  regarding  individual  intake  of  the  full  participants  be  generalized  to  those  respondents  in  households  that 
discontinued  participation  at  an  earlier  point?  Alternatively,  is  there  bias  with  regard  to  level  of  participation? 

Investigating  the  effects  of  the  level  of  participation  within  a  given  sample  falls  into  a  general  class  of  methods  of 
nonresponse  analysis  that  incorporate  the  use  of  a  nonparticipating  internal  criterion  group  to  compare  to  full 
participants  to  determine  if  the  groups  are  similar  (17,  18,  19,  20,  21).  The  sources  of  this  criterion  group  are  diverse, 
but the basic requirements are (1) the data from this group must be obtained in the same manner as the data from the
full  participants  and  (2)  there  must  be  a  justification  of  why  this  criterion  group  may  represent  nonrespondents. 

The  households  that  dropped  out  before  the  completion  of  all  survey  components  fulfilled  these  requirements.  The 
information  from  them  was  obtained  using  the  same  methodology,  design,  and  target  sample  as  the  full  participants. 
This  is  an  important  quality  since  differences  in  these  factors  between  NFCS  1987-88  and  other  surveys  used  for 
comparison  were  a  confounding  limitation  common  to  the  other  nonresponse  studies  included  in  this  report. 

There  is  also  reason  to  believe  that  this  nonparticipating  group  may  be  more  similar  to  survey  nonrespondents  than 
are  the  fully  cooperating  participants.  The  universal  characteristic  shared  by  this  group  and  nonrespondents  was  their 
propensity  to  refuse  participation,  even  if  it  was  at  different  points  in  the  survey  administration.  Refusal  was  a  major 
source  of  complete  nonresponse  in  the  NFCS  1987-88  (table  1),  and  it  was  also  the  primary  reason  cited  for 
discontinued  participation  (22);  therefore,  the  households  that  dropped  out  before  the  completion  of  all  survey 
components  can  be  used  to  detect  nonresponse  bias  with  regard  to  level  of  participation.  Also,  there  is  potential  for 
extending  their  use  as  proxies  for  total  nonrespondents,  at  least  with  regard  to  refusals,  although  possibly  not  for 
noncontacts.  There  is  reasonable  support  in  the  literature  to  the  effect  that  proxies  obtained  by  various  means 
adequately  represent  refusals.  However,  noncontacts  have  been  shown  to  be  distinctly  different  in  many  relevant 
characteristics from participants and from refusals as well (21, 23, 24, 25, 26, 27), and noncontacts were high in this
survey.  For  this  reason,  extrapolation  of  results  to  all  nonrespondents  is  not  advisable. 

The  research  included  the  following  steps:  First,  since  the  household  was  the  unit  of  analysis,  individuals  were 
classified  at  the  household  level;  and  the  validity  of  that  classification  was  determined.  Second,  a  preliminary  analysis 
was  performed  to  detect  differences  among  the  groups  in  sociodemographic  factors  that  are  related  to  food 
consumption and dietary intake (4, 5, 6, 7). Third, two statistical tests of differences between levels were performed:
(1) a multivariate analysis of variance (MANOVA) with dietary quality as the dependent variable and level of participation
as the grouping variable to determine if nonresponse bias was present with regard to participation, and (2) a multivariate
analysis of covariance (MANCOVA) to determine whether controlling for those factors associated with dietary quality
that  were  found  to  be  dissimilar  among  the  levels  of  participation  would  effectively  nullify  the  bias  by  eliminating  the 
differences  observed  between  the  levels  on  the  variable  of  interest.  The  significance  of  these  specific  variables  in 
detecting  bias  could  merely  be  an  artifact  of  this  particular  sample.  Unweighted  data  were  used  for  the  analyses 
because  we  sought  assessment  of  effects  of  nonresponse  prior  to  adjustments. 


Classification  of  Households 

All households included in this analysis provided satisfactory household-level information. A total of 4,589 households
were available for this study; however, 316 households were unusable because no member of the household had 10
or more meals from the household food supply, so there was insufficient information available to determine household
food use. Another 31 households were excluded because the patterns of individual participation in those households
were too mixed to categorize them into a level as defined by the decision rules set forth. As a result, 92 percent of the
available sample was utilized (table 12). The remaining 4,242 households were divided into three levels:

(1) Level 1: Fully participating (3,195 households). At least half of the individuals
in each of these households completed the 1-day recall and at least 1 day of the
2-day record. Respondents giving 2 days of individual intake were grouped with
3-day respondents because they did participate in the self-administered component
of the survey even if only to a limited extent.

(2) Level 2: 1 day of individual participation (479 households). At least half of the
individuals in the household completed the 1-day recall.

(3) Level 3: No individual participation (568 households). Less than half of the
individuals in each of these households supplied any acceptable individual intake
information.

Table 12.--Classification of households by level of participation, NFCS 1987-88

Type of household                                          Number   Percent   Cumulative
                                                                                 percent

Level 1: Household component and 2 or 3 days of intake      3,195      69.6        69.6
Level 2: Household component and 1 day of intake              479      10.4        80.0
Level 3: Household component only                             568      12.4        92.4
Intakes too mixed to classify                                  31       0.7        93.1
Unusable household records                                    316       6.9       100.0

Total participating households                              4,589

When  the  same  number  of  individuals  participated  at  two  different  levels,  the  household  was  classified  at  the  higher 
(more  complete)  level. 
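
The decision rules above can be sketched in code. This is an illustrative reading, not the procedure actually programmed for the survey. It assumes each individual is summarized by the number of days of acceptable intake data supplied (0 to 3), and that "at least half" is applied first to 2 or more days and then to 1 or more days. The function name is hypothetical, and the report's separate exclusion of households with patterns "too mixed to classify" is not modeled because its rules are not given.

```python
def classify_household(days_per_person):
    """Assign a household to participation Level 1, 2, or 3.

    days_per_person: number of days of acceptable intake data (0 to 3)
    supplied by each individual in the household.
    """
    n = len(days_per_person)
    if n == 0:
        raise ValueError("household has no individuals")
    two_or_more = sum(1 for d in days_per_person if d >= 2)
    one_or_more = sum(1 for d in days_per_person if d >= 1)
    # "At least half" means a tie goes to the more complete level.
    if 2 * two_or_more >= n:
        return 1
    if 2 * one_or_more >= n:
        return 2
    return 3


# Hypothetical households, for illustration only.
print(classify_household([3, 3, 0]))  # mostly full participants
print(classify_household([1, 1, 0]))  # mostly 1-day respondents
print(classify_household([0, 0, 1]))  # mostly no individual intake
print(classify_household([3, 0]))     # tie between two levels
```

Testing the more complete level first is what implements the tie rule: a two-person household split between a 3-day respondent and a nonrespondent is placed in Level 1.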

To validate that the individual response rate was accurately represented by household response, a cross-tabulation of
household classification by actual participating level of all individuals in those households was performed. The
cross-tabulation revealed that on the individual level, 92 percent of the individuals in the households in this study were
correctly classified. In other words, 92 percent of the individuals completed the same number of days (none, one, or
all) as most of the other individuals in the same household. This finding agrees with that found in the literature: most
nonresponse occurs at the household level (28).
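
The validation step can be sketched as a cross-tabulation of household level against each individual's own level, followed by the share of individuals on the diagonal. The data below are hypothetical and the function name is illustrative. The report's 92 percent figure comes from the survey records, not from this example.

```python
from collections import Counter


def agreement_rate(pairs):
    """Cross-tabulate levels and return (cell counts, share that agree).

    pairs: iterable of (household_level, individual_level) tuples,
    one tuple per individual.
    """
    table = Counter(pairs)  # cross-tabulation cell counts
    total = sum(table.values())
    matched = sum(c for (hh, ind), c in table.items() if hh == ind)
    return table, matched / total


# Hypothetical individuals, for illustration only.
pairs = [(1, 1)] * 9 + [(1, 3)] + [(2, 2)] * 4 + [(3, 3)] * 5 + [(3, 2)]
table, rate = agreement_rate(pairs)

print(table[(1, 3)])  # individuals at Level 3 in Level 1 households
print(rate)           # share classified the same as their household
```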


Preliminary  Data  Analysis 

In this study, it was important to consider sociodemographic characteristics related to dietary quality when evaluating
similarity  of  this  variable  among  the  three  defined  levels  of  participation.  If  differences  in  sociodemographic 
characteristics  could  solely  explain  discrepancies  found  between  the  levels,  they  could  be  used  to  predict  performance 
differences  on  dietary  quality. 

Thirteen variables, similar but not identical to those used in weighting the household data, have been found to be
linked with dietary quality (6, 7, 29, 30, 31, 32, 33, 34, 35). (Household weight variables had not yet been determined
when this study was initiated.) These variables were analyzed for differences between the three groups. They were
region; degree of urbanization; last year's income; Food Stamp Program participation; presence of child in the
household under 7 years of age; presence of child in the household 7 to 17 years of age; single/dual head(s) of
household; household size; and female head's age, race, ethnic origin (Hispanic or not), education level, and
employment status. Region and urbanization were assigned from information on the sampling frame. Income had
been imputed for 596 households using an ordinary least squares procedure relating the household and personal
characteristics available. For the 335 households having no female head, characteristics of the male head were used.
No other variables had missing data.

Categorical  variables  were  subjected  individually  to  loglinear  tests  of  independence  against  level  of  participation. 
Since  multiple  tests  were  performed,  a  Bonferroni  adjusted  significance  level  of  .004  (.05/13)  was  applied.  Continuous 
variables  were  tested  by  considering  each  in  a  one-way  analysis  of  variance  with  level  of  participation  as  the  grouping 
variable. 
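
The Bonferroni adjustment described above divides the overall significance level by the number of tests. A minimal sketch follows; the function names and the example p-values are hypothetical and are not results from the report.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under a Bonferroni adjustment."""
    return alpha / n_tests


def flag_significant(p_values, n_tests, alpha=0.05):
    """Map each variable name to True if its p-value beats the adjusted level."""
    threshold = bonferroni_alpha(alpha, n_tests)
    return {name: p < threshold for name, p in p_values.items()}


# Thirteen characteristics were tested, so the adjusted level is 0.05 / 13.
print(round(bonferroni_alpha(0.05, 13), 4))

# Hypothetical p-values, for illustration only.
example = {"variable_a": 0.001, "variable_b": 0.020}
print(flag_significant(example, n_tests=13))
```

A p-value of 0.020 would pass an unadjusted 0.05 test but fails the adjusted level, which is the conservatism the adjustment is meant to provide when many characteristics are tested at once.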

Results  of  the  tests  of  differences  among  participation  levels  on  these  characteristics  indicated  that,  with  the 
exceptions  of  Food  Stamp  Program  status  and  the  ethnic  origin  and  employment  status  of  the  female  head,  all 
variables  were  significantly  different.  However,  it  should  be  noted  that  the  assumption  of  homogeneity  of  variances 
was  violated.  This  condition  threatens  the  validity  of  the  test  to  detect  real  differences  between  groups.  The  two 
smaller  groups,  Levels  2  and  3,  displayed  consistently  larger  variances  than  the  fully  participating  group  (Level  1). 

Definition  of  Dietary  Quality 

Two  types  of  information  on  food  that  could  be  used  to  measure  dietary  quality  were  available  in  the  NFCS:  household 
food use over a 7-day period and food intake by individual household members for up to 3 days. The household food
use data were the common information available for comparing the dietary quality of the three different participating
levels.  It  is  a  plausible  assumption  to  consider  food  used  by  households  to  be  highly  related  to  actual  food  intake  of 
individuals  in  any  given  household;  therefore,  if  households  that  dropped  out  have  food  use  resembling  that  of  the  fully 
participating  households,  it  is  not  unreasonable  to  expect  their  individual  intakes  to  be  similar  to  participants'  intakes 
as  well. 

Measurement  of  a  concept  as  complex  as  dietary  quality  by  utilizing  any  single  variable  is  difficult.  Consequently, 
dietary  quality  was  operationalized  by  determining  the  nutritive  value  of  foods  used  by  the  household  expressed  as 
percentage  of  the  Recommended  Dietary  Allowances  (RDA)  for  15  nutrients  and  food  energy  adjusted  for  both  the 
number  of  meals  eaten  away  from  home  and  the  sex-age  composition  of  the  household.  The  RDA  percentages  were 
considered  collectively  in  a  multivariate  analysis. 

Statistical  Tests 

A multivariate analysis of variance (MANOVA) was performed testing for differences in dietary quality by level of
participation. A multivariate analysis of covariance (MANCOVA) was then performed controlling for the characteristics
found discrepant among the participating levels. All statistical tests were conducted using version 4.0 of SPSS-X (36).

Testing  of  assumptions  for  MANOVA  indicated  that  while  multivariate  nonnormality  did  not  appear  to  be  a  problem,  the 
assumption  of  homogeneity  of  variance-covariance  matrices  of  the  three  participating  levels  was  violated  due  to  the 
large  discrepancies  in  size  between  the  three  groups.  Bartlett's  Box-M  test  revealed  that  it  was  not  feasible  to  regard 
the  variance-covariance  matrices  as  homogeneous.  The  smaller  groups-those  that  discontinued  participation  at 
some  point-produced  significantly  larger  variances  and  covariances.  In  this  situation,  the  significance  test  is  too 
liberal,  and  whereas  the  null  hypothesis  (of  no  difference)  may  be  accepted  with  confidence  if  this  is  the  outcome, 
findings  of  differences  are  questionable  (37). 

Visual  inspection  of  the  correlation  matrix  and  Bartlett's  test  of  sphericity  were  used  to  assess  the  validity  of 
considering  all  dependent  variables  collectively  in  a  multivariate  framework.  Both  tests  determined  that  the  dependent 
variables  were  highly  correlated.  The  average  correlation  between  variables  was  0.62. 

The multivariate analysis of variance test statistic (Pillai's Trace) showed that there were significant differences among
the three groups on dietary quality when there was no adjustment of relevant factors (p < .05). Since the test statistic
was significant, level differences in mean intakes were interpreted by examining the univariate F-tests. The only
univariate test that reached the adjusted level of significance was that for protein (p < 0.05/15 = 0.003).
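
For reference, Pillai's trace is conventionally defined from the hypothesis and error sums-of-squares-and-cross-products matrices. The symbols below are standard textbook notation and do not appear in the original report: H is the between-group (hypothesis) matrix, E is the within-group (error) matrix, the lambda_i are the nonzero eigenvalues of E^{-1}H, and s is the smaller of the number of dependent variables and the number of groups minus 1 (at most 2 for the three participation levels).

```latex
V \;=\; \operatorname{tr}\!\left[ H \,(H + E)^{-1} \right]
  \;=\; \sum_{i=1}^{s} \frac{\lambda_i}{1 + \lambda_i}
```

Values of V near zero indicate little separation among the group mean vectors. The statistic is usually converted to an approximate F value to obtain the p-value.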

The  inequality  of  the  variance-covariance  matrices  was  also  a  problem  in  the  MANCOVA.  Bartlett's  Box-M  showed 
there  was  heterogeneity  between  the  three  groups  with  the  two  smaller  groups,  Levels  2  and  3,  exhibiting  significantly 
larger  variances  and  covariances. 

Results  of  the  effects  of  the  covariates  revealed  that  they  performed  as  expected.  Small  but  significant  contributions  to 
prediction  were  noted  both  collectively  in  a  multivariate  framework  (p  <  0.05)  and  individually  in  the  univariate  tests  as 
a  result  of  their  use  (p  <  0.004). 

The  effect  of  level  on  dietary  quality  was  not  significant  (p  =  0.25).  Households  with  no  individual  intake,  households 
with  1  day  of  intake,  and  households  with  2  to  3  days  of  intake  were  indistinguishable  on  dietary  quality  when 
measured  in  terms  of  the  Recommended  Dietary  Allowances  of  15  nutrients  and  food  energy  derived  from  foods 
measured  via  household  food  use  data  and  controlling  for  pertinent  sociodemographic  characteristics. 


Conclusion 

This study revealed that dietary quality differed among households participating at various levels in the NFCS 1987-88.
However,  when  relevant  sociodemographic  characteristics  were  accounted  for  by  controlling  them  in  a  MANCOVA,  the 
differences  disappeared  as  expected.  These  characteristics  explained  the  disparities  in  dietary  quality  among  the 
households  participating  at  the  various  levels.  While  these  results  cannot  be  generalized  with  confidence  to  sample 
households  that  did  not  participate  in  the  survey  at  all,  they  do  support  the  use  of  this  set  of  characteristics  in  the 
nonresponse  adjustment. 


Conclusions 


The  LSRO  Expert  Panel  concluded,  and  HNIS  concurs,  that  it  is  not  possible,  based  on  the  information  available,  to 
establish the presence or absence of nonresponse bias in NFCS 1987-88. However, the likelihood of such bias cannot
be  disregarded.  It  is  also  not  possible  to  determine  objectively  the  extent  to  which  nonresponse  bias  might  influence 
interpretation  of  analyses  using  data  from  NFCS  1987-88.  The  panel  concluded  that  between-group  comparisons  are 
possible  but  must  be  made  with  the  recognition  that  the  respondents  may  not  be  completely  representative  of  the 
subgroups.  The  panel  also  concluded  that  use  of  the  data  for  estimates  of  specific  foods  or  food  groups,  estimates  of 
upper  percentiles  of  intake,  or  estimates  of  intakes  of  subgroups  for  which  the  cell  size  is  small  is  particularly 
questionable  (1).  Although  the  panel  focused  specifically  on  the  individual  intake  component  of  the  survey,  these 
cautions  should  be  applied  to  the  household  component  as  well. 

Although  the  possibility  of  nonresponse  bias  cannot  be  disregarded  and  the  NFCS  data  have  serious  potential  for 
error,  the  procedures  used  to  weight  the  NFCS  data  have  limited  the  potential  for  bias  as  much  as  possible.  All 
surveys have strengths and weaknesses, and while the weaknesses of the NFCS are potentially serious, this should
not  rule  out  use  of  the  data.  NFCS  1987-88  provides  the  only  current  data  available  on  household  and  individual 
food  consumption. 

The  analyses  summarized  in  this  report  and  elsewhere  (38,  39,  40)  suggest  that  NFCS  1987-88  provides  better 
estimates of current dietary intake than does the NFCS 1977-78, which is often the only alternative. The potential
nonresponse bias in NFCS 1987-88 introduces less distortion in estimates of current consumption patterns than does
the use of data collected a decade earlier.

Individuals using NFCS 1987-88 must do so with the greatest caution and with a full understanding of its limitations.
Reports  of  findings  should  mention  the  potential  for  nonresponse  bias  and  include  a  statement  of  the  response  rates. 
Users  should  carefully  balance  their  need  and  tolerance  for  error  in  their  application  against  the  limitations. 

FOREWORD


The  Life  Sciences  Research  Office  (LSRO),  Federation  of  American  Societies  for  Experimental 
Biology  (FASEB),  provides  scientific  assessments  of  topics  in  the  biomedical  sciences.  Reports  are 
based  upon  literature  reviews  and  the  scientific  opinions  of  knowledgeable  investigators  engaged  in 
work  in  specific  areas  of  biology  and  medicine. 

This  report  was  developed  for  the  Human  Nutrition  Information  Service,  U.S.  Department  of 
Agriculture,  in  accordance  with  the  provisions  of  Purchase  Order  No.  43-3198-1-0154.  It  was 
prepared  by  an  ad  hoc  Expert  Panel  convened  by  LSRO  with  the  assistance  of  Sue  Ann  Anderson, 
Ph.D.,  Senior  Staff  Scientist  and  Kenneth  D.  Fisher,  Ph.D.,  Director,  LSRO.  The  members  of  the 
[...] Chapter VIII. [...]
background information on the 1987-88 Nationwide Food Consumption Survey conducted by HNIS
and  to  assess  the  impact  of  a  high  rate  of  nonresponse  on  the  dietary  data  from  that  survey.  The 
Panel discussed each draft and the final report and provided additional documentation and viewpoints
for incorporation into the final report. However, the LSRO accepts responsibility for the
study conclusions and accuracy of the report, and the listing of these individuals in Chapter VIII
does  not  imply  that  individual  Panel  members  specifically  endorse  all  statements  in  the  report. 

The  final  report  was  reviewed  and  approved  by  the  LSRO  Advisory  Committee  (which  consists  of 
representatives of each constituent society of FASEB) under authority delegated by the FASEB
Board. Upon completion of these review procedures, the report was approved and transmitted to
the  Human  Nutrition  Information  Service  by  the  Executive  Director,  FASEB. 

While  this  is  a  report  of  the  Federation  of  American  Societies  for  Experimental  Biology,  it  does  not 
necessarily  reflect  the  opinion  of  each  individual  member  of  the  FASEB  constituent  Societies. 

I. INTRODUCTION


A.  BACKGROUND 

The  1987-88  Nationwide  Food  Consumption  Survey  (NFCS)  was  conducted  by  the  Human  Nutrition 
Information  Service  (HNIS),  U.S.  Department  of  Agriculture  (USDA),  to  provide  data  for  estimates 
of  food  consumption  by  households  and  individuals  in  the  48  conterminous  United  States.  The 
survey  was  designed  as  a  self-weighting  stratified  area  probability  sample.  The  sampling  units  were 
households and individuals within sample households. The target sample was 6,000 households,
projected  to  yield  15,000  individuals.  Further  details  about  the  sample  design  are  provided  in 
Appendix  A.  The  survey  was  conducted  under  contract  for  the  HNIS  by  National  Analysts  of 
Philadelphia,  Pennsylvania,  a  division  of  Booz,  Allen  and  Hamilton,  Inc. 

Data collection for the survey was planned for a one-year period beginning in April 1987. However,
low response rates became evident in the first quarter of the survey, necessitating adjustments
to increase the sample size in subsequent quarters. As described in Appendix B, the
size of sample draws was increased for the second, third, and fourth quarters, and the data collection
period was extended for a fifth quarter without an additional sample of households being drawn.
Despite these efforts, the response rate was only about 38% for households in the survey and lower
for individual participants in the sampled households (Appendix C).

The HNIS examined the available information on food consumption and sociodemographic characteristics
of participants and conducted statistical analyses to explore the impact of the nonresponse
on the estimates based on these data. In addition, the HNIS requested that the Life Sciences
Research Office (LSRO) of the Federation of American Societies for Experimental Biology (FASEB)
conduct an independent review of the impact of nonresponse on estimates of food and nutrient
intakes based on the data from the 1987-88 NFCS and make recommendations about possible
uses of the data. LSRO convened an ad hoc Expert Panel consisting of statisticians with expertise
related to survey design and nonresponse issues to assess the effects of nonresponse in the 1987-88
NFCS. These individuals are listed in Chapter VIII. This report summarizes the analysis of the
nonresponse issues by the LSRO ad hoc Expert Panel.


B.        SCOPE  OF  WORK 

In  the  Scope  of  Work  for  this  study,  the  HNIS  specified  that  the  following  tasks  be  performed  with 
respect  to  the  1987-88  NFCS: 

•  examine  the  statistical  design  and  survey  execution  with  particular  emphasis  upon  issues 
related  to  nonresponse; 

•  review  analyses  on  nonresponse  conducted  by  the  HNIS; 

•  identify  additional  analyses  needed  to  evaluate  further  the  potential  for  nonresponse  bias  in 
the  NFCS;  and, 

•  prepare a report that summarizes the findings of the above reviews and identifies critical
issues  relating  to  the  implications  of  potential  nonresponse  bias  that  the  HNIS  may  consider 
for  inclusion  in  formal  publications  of  survey  results  and  research  analyses. 

C.        DEFINITION OF NONRESPONSE BIAS


Nonresponse bias is the difference between the true value of the quantity being estimated and the
expected value of the estimate provided by the respondents to the survey. Nonresponse bias can be
restated as the product of two terms: 1) the difference between the estimated value for the
respondents and the nonrespondents and 2) the proportion of nonrespondents.

A  technical  presentation  of  nonresponse  bias  follows. 

R and NR are symbols for response and nonresponse. To estimate the mean intake of a nutrient, $\bar{Y}$,
the nonresponse bias would be the difference between $\bar{Y}_R$ and $\bar{Y}$, that is, $(\bar{Y}_R - \bar{Y})$, where $\bar{Y}_R$ is the
mean intake of all respondents in the population.

If $P(R)$ is the proportion of respondents in the population, then $P(NR) = 1 - P(R)$ is the proportion
of nonrespondents. Thus,

$$\bar{Y} = \bar{Y}_R P(R) + \bar{Y}_{NR} P(NR)$$

Hence, nonresponse bias is given by the expression

$$\bar{Y}_R - \bar{Y} = \bar{Y}_R - \bar{Y}_R P(R) - \bar{Y}_{NR} P(NR) = \bar{Y}_R [1 - P(R)] - \bar{Y}_{NR} P(NR) = \bar{Y}_R P(NR) - \bar{Y}_{NR} P(NR) = [\bar{Y}_R - \bar{Y}_{NR}] P(NR).$$
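
The product form of the bias can be checked numerically. The sketch below is illustrative only: the function name and the intake values are hypothetical and are not taken from the survey. It computes the bias as the difference between respondent and nonrespondent means multiplied by the proportion of nonrespondents, and confirms that this agrees with the respondent mean minus the population mean.

```python
import math


def nonresponse_bias(mean_r: float, mean_nr: float, p_nr: float) -> float:
    """Bias of the respondent mean: (mean_r - mean_nr) * P(NR)."""
    if not 0.0 <= p_nr <= 1.0:
        raise ValueError("p_nr must be a proportion between 0 and 1")
    return (mean_r - mean_nr) * p_nr


# Hypothetical values, not taken from the survey.
mean_r, mean_nr, p_nr = 2000.0, 1800.0, 0.62

# Population mean as the response-weighted mix of the two group means.
population_mean = mean_r * (1.0 - p_nr) + mean_nr * p_nr
bias = nonresponse_bias(mean_r, mean_nr, p_nr)

# The product form agrees with the direct definition (respondent mean minus population mean).
assert math.isclose(bias, mean_r - population_mean)
print(round(bias, 1))
```

The example makes the practical difficulty visible: the bias depends on the nonrespondent mean, which is exactly the quantity a survey with high nonresponse does not observe.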

II. SURVEY DESIGN AND EXECUTION


A.        RESPONSE  RATES  AT  DIFFERENT  STAGES  OF  CONTACT  WITH  HOUSEHOLDS  AND 
INDIVIDUALS 

The HNIS provided information about response rates of households and individuals that was
valuable for tracking the points at which nonresponse occurred (Appendix C).

The rate of failure to make initial contact was quite high (greater than 17%). Because call-back
records were not kept consistently by all interviewers, further evaluation of this problem was not
possible. The rate of refusal to participate in screening (14%) was not unusual; however, the rate of
refusal by those screened to participate in the interviews (45%) was extremely high. In contrast, in
the 1986 CSFII, about 25% did not provide a usable interview (Tuszynski and Roidt, 1989). The
45% refusal rate after screening was nearly twice as high as refusal in the CSFII. The proposed
respondent burden (provision of household records plus individual dietary intake data) and the lack
of sufficient incentives to participate may have contributed to the low response rate. In addition,
structural problems may have occurred in the survey: screening and interview techniques that were
not maximally effective, insufficient training and monitoring of the interviewers, high rates of
interviewer turnover, and/or interviewers' failure to follow prescribed schedules.
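
Losses at successive stages of contact compound. The sketch below multiplies the retention rates implied by the three figures quoted above. It assumes that each rate is conditional on reaching the previous stage; the report does not state the denominators, so this is an assumption made for illustration, and the function name is hypothetical.

```python
import math


def overall_response_rate(stage_loss_rates):
    """Multiply the retention rates (1 - loss) of successive stages."""
    rate = 1.0
    for loss in stage_loss_rates:
        if not 0.0 <= loss <= 1.0:
            raise ValueError("each loss rate must be between 0 and 1")
        rate *= 1.0 - loss
    return rate


# Rates quoted in the text; treating each as conditional on the previous
# stage is an assumption, since the report does not state the denominators.
losses = [0.17, 0.14, 0.45]  # no contact, screening refusal, interview refusal
rate = overall_response_rate(losses)
print(f"{rate:.3f}")  # prints 0.393
```

Under that assumption the product is roughly 39%. Because the no-contact rate is given only as "greater than 17%", this is an upper bound, which is in the neighborhood of the household response rate of about 38% reported for the survey.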

Inspection  of  data  on  demographic  information  (age,  sex,  and  race)  (Appendix  D)  suggested  to  the 
Expert  Panel  that  some  differences  may  have  existed  between  households  that  provided  one  day's 
data, those that provided 2 or 3 days' data, and those that refused to participate further after
screening.  For  example,  86%  of  white  participants  who  were  screened  provided  one  day's  data  while 
78%  of  black  participants  provided  this  amount  of  data.  This  suggested  to  the  Expert  Panel  that 
race  may  be  a  factor  in  degree  of  participation  in  the  survey.  However,  the  lack  of  nonresponse  data 
severely limits any attempt to compare characteristics of responding versus nonresponding
households  and  individuals. 


B.        NONRESPONSE  STUDIES 

The  LSRO  Expert  Panel  was  aware  that  the  contract  with  National  Analysts  included  a  study  of  the 
characteristics  of  nonrespondents.  Within  the  time  frame  of  this  LSRO  review,  the  contractor  did 
not submit the data or analysis of data on nonresponse in the 1987-88 NFCS to the HNIS. Examination
of the possible influence of nonresponse requires study at the time a survey is being
conducted.  At  this  time  (1991),  conduct  of  a  retrospective  nonresponse  study  would  probably 
introduce  many  contaminants. 

A  nonresponse  study  of  characteristics  of  nonrespondents  in  the  1986  CSFII  (Tuszynski  and  Roidt, 
1989)  did  not  provide  sociodemographic  information  that  could  be  applied  to  the  present  survey 
because  only  women  19  to  50  years  of  age  and  their  children  1  to  5  years  of  age  were  included  in 
that  survey.  Little  information  on  nonresponse  was  available  from  the  1977-78  NFCS  (Appendix 
E). 


C.        WEIGHTING  SCHEME 

The  weighting  method  described  by  Loughin  and  Fuller  (1990)  is  a  reasonable  approach;  however, 
because of the extremely large range and unusual distribution of the weights in this system, the
Expert  Panel  members  were  concerned  with  the  potential  bias  that  might  result  from  using  these 
weights. 

Mean weighting factors for participants in the survey were 23.5 for all individuals, 23.4 for all
persons under 20 years of age, 25.1 for men 20 years of age and older, and 22.3 for women 20 years
of age and older (Loughin and Fuller, 1990). The range of weights in the existing weighting system
is very large (1 to 78 for females more than 20 years of age, 1 to 130 for males more than 20 years of
age, and 1 to 136 for males and females less than 20 years of age). This very large range of weights,
as well as the unusual distribution of the weights (particularly for females), is troublesome. See
Appendix F for a summary of characteristics of women with weighting factors greater than 70.

It is difficult to compare the ranges in the weights from the 1987-88 NFCS to those of other surveys
because of structural requirements (equal sample representation by day of week and month of year
for the NFCS) and because the one-stage weighting system used in the NFCS did not permit
contributions of the nonresponse components to be separated from contributions of the equal day of
the week and month of the year requirements. Comparison of the range of weights from this survey
with those of other surveys such as the Current Population Survey (CPS) or the National Health
Interview Survey (NHIS) is problematic because these other surveys have unequal probabilities of
selection and post-stratification adjustments in addition to nonresponse adjustments. Unless the
nonresponse component can be separated, it is misleading to compare the ranges of weights.

III. REVIEW AND EVALUATION OF COMPARISONS OF THE
1987-88  NFCS  DATA  WITH  DATA  FROM  OTHER 
CONTEMPORANEOUS  NATIONAL  SURVEYS 


Prior  to  the  meeting  of  the  ad  hoc  Expert  Panel,  the  HNIS  prepared  a  series  of  comparisons  of  the 
1987-88  NFCS  data  to  similar  data  from  other  national  surveys  to  aid  in  answering  questions  about 
the impact of nonresponse on the estimates of food and nutrient intakes. In addition, sociodemographic
data from the 1987-88 NFCS were compared with similar data from the March 1987
Current Population Survey (CPS) of the Bureau of the Census (Appendix G) and the Epidemiology
Portion of the Cancer Control Supplement of the 1987 National Health Interview Survey (NHIS) of
the National Center for Health Statistics (Appendix H). Food consumption and nutrient intake data
were compared to three previous USDA food consumption surveys: the 1977-78 NFCS and the 1985
and 1986 CSFII (Appendix I).


A.        COMPARISON  OF  THE  1987-88  NFCS  SOCIODEMOGRAPHIC  DATA  WITH  THE 
MARCH  1987  CURRENT  POPULATION  SURVEY  ESTIMATES 

Because  the  NFCS  was  designed  to  be  self-weighting,  the  unweighted  data  should  match  the  CPS 
estimates  reasonably  well  if  there  were  no  problems  associated  with  nonresponse.  Unweighted  data 
on  thirteen  sociodemographic  variables  from  the  1987-88  NFCS  that  have  been  shown  to  be  related 
to  dietary  intake  were  compared  to  population  distributions  derived  from  the  March  1987  CPS  data 
(see  Appendix  G).  This  analysis  showed  that  there  were  statistically  significant  differences  for  the 
unweighted  NFCS  sample  relative  to  the  CPS  distribution  for  the  following  characteristics: 

•  a  larger  proportion  of  individuals  from  economically  poorer  households  and  a  smaller 
proportion  from  economically  richer  households; 

•  a larger proportion of individuals from households with two adults;

•  a  smaller  proportion  of  women  from  households  with  working  female  heads; 

•  a  smaller  proportion  of  men  and  women  from  households  with  a  female  head  under  41  years 
of  age  and  no  children;  and 

•  smaller  proportions  of  participants  20  to  24  years  of  age  and  15  to  19  years  of  age. 

These  findings  suggest  that  there  is  an  underrepresentation  of  nontraditional  families.  Those 
nontraditional  families  that  provided  information  are  vitally  important  because  they  are  small  in 
number in the sample and, therefore, heavily weighted. If they are not representative of nontraditional
families, severe bias could result.
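
A comparison between an unweighted sample distribution and a reference distribution can be summarized, for a single categorical variable, with a Pearson goodness-of-fit statistic. The report does not say which test the HNIS used, so the sketch below is only one plausible approach. The counts and reference shares are hypothetical, the function name is illustrative, and the statistic assumes simple random sampling, which a stratified area probability design does not strictly satisfy.

```python
def chi_square_gof(observed_counts, reference_proportions):
    """Pearson goodness-of-fit statistic against reference proportions."""
    if len(observed_counts) != len(reference_proportions):
        raise ValueError("inputs must have the same length")
    n = sum(observed_counts)
    stat = 0.0
    for observed, share in zip(observed_counts, reference_proportions):
        expected = n * share  # count expected if the sample matched the reference
        stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts for three income categories in an unweighted sample,
# and hypothetical reference shares standing in for CPS estimates.
observed = [300, 500, 200]
reference = [0.25, 0.50, 0.25]
stat = chi_square_gof(observed, reference)

# 5.991 is the 0.05 critical value of chi-square with 2 degrees of freedom.
print(stat, stat > 5.991)  # prints 20.0 True
```

In this made-up case the first category is overrepresented and the third underrepresented, so the statistic exceeds the critical value and the sample would be judged to differ from the reference distribution.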

In  addition  to  the  concerns  about  nontraditional  families,  a  question  arose  in  the  discussions  of  the 
Expert Panel about the designations of urbanization in the two surveys. According to information
in  Appendices  G  and  J,  the  categorizations  of  urbanization  were  not  the  same  for  the  NFCS  and  the 
CPS.  The  CPS  used  June  1983  designations  and  the  NFCS  used  1980  Census  designations.  The 
CPS  estimates  of  the  number  of  households  within  a  given  level  of  urbanization  were  regarded  as 
subject  to  appreciable  sampling  errors  due  to  the  nature  of  the  sampling  design.  Reweighting  the 
NFCS individual intake sample has not used urbanization because 1) estimates for relative numbers
of households, based on the NFCS sample and weights supplied by the survey contractor (National
Analysts,  Philadelphia)  were  considered  much  more  reliable  than  similar  estimates  for  the  relative 
numbers  of  individuals  and  2)  individual  values  were  believed  to  have  less  mathematical  dependence 
on  urbanization  than  household  survey  values  (see  Appendix  J). 

Because of the differences in designation of urbanization between the NFCS and the CPS, analyses
comparing  the  distribution  of  urbanization  of  the  NFCS  sample  to  another  contemporary  sample  are 
not  possible.  Thus,  it  cannot  be  determined  whether  the  NFCS  sample  is  representative  of  the  U.S. 
population  with  respect  to  urbanization  and  whether  the  weighting  scheme  corrected  for 
discrepancies  in  representation  of  subgroups.  In  the  NFCS  sample,  urbanization  appears  to  be  a 
factor  affecting  intake  of  food  energy  and  fiber  (see  Chapter  IV). 


B.        COMPARISONS OF SOCIODEMOGRAPHIC DATA AND FOOD CONSUMPTION DATA
FROM 1987-88 NFCS AND THE 1987 NHIS

The 1987 NHIS, with 22,080 respondents, had a response rate of about 95 percent. Therefore,
nonresponse bias was considered less likely to be a factor in that survey. Some questions on sociodemographic
variables were asked in an identical or very similar manner in the 1987-88 NFCS and
the 1987 NHIS, possibly permitting comparisons of some sociodemographic characteristics (see
Appendix H).

However,  a  preliminary  comparison  of  selected  characteristics  in  the  two  surveys  suggested  that 
such  comparisons  probably  introduced  another  series  of  variables.  In  reviewing  similar  questions 
asked in the NHIS and NFCS, the Expert Panel regarded the questions as sufficiently different to limit
the usefulness of comparisons of sociodemographic characteristics of participants in the two surveys.
For example, the NHIS appeared to have classified single adult households with or without children
as one-adult households but the NFCS classified only persons living alone as single person households.
Differences in population characteristics with respect to this variable appear to be a reflection
of the question asked. In addition, urbanization and Hispanicity were designated differently so
that  direct  comparisons  cannot  be  made  for  these  variables. 

Commonalities  between  food  frequency  data  were  also  explored  as  an  area  of  overlap  between  these 
two  surveys.  In  the  NHIS,  participants  were  administered  a  subset  of  the  Block  food  frequency 
questionnaire  containing  approximately  60  food  items.  In  the  NFCS,  participants  were  asked 
questions pertaining to frequency of consumption of calcium-rich foods. See Appendix K for
examples  of  questions  and  comparisons  of  intakes  from  the  two  surveys.  Some  differences  were 
observed  in  mean  intakes  of  products  in  these  categories  between  the  two  surveys  and  these  were 
difficult  to  interpret  because  of  the  differences  in  wording  of  the  questions.  For  example,  the  two 
questionnaires  differed  in  types  of  cheeses  included  in  the  "Cheese"  categories  and  in  products  added 
to  coffee  (milk  versus  milk  or  cream).  Similar  discrepancies  existed  for  most  food  items  in  the  two 
surveys. 


C.        COMPARISON  OF  NUTRIENT  LEVELS  AND  FOOD  USE  AMONG  HOUSEHOLDS 
PARTICIPATING  TO  DIFFERENT  EXTENTS  IN  THE  1987-88  NFCS 

To  explore  the  question  of  whether  nonresponse  had  an  effect  on  the  individual  intake  data  of  the 
1987-88  NFCS,  the  HNIS  compared  nutrient  levels  and  food  use  among  households  that  were 
classified  according  to  level  of  participation  (Appendix  D).  In  these  analyses,  households  that  had 
provided  responses  at  the  household  level  but  did  not  respond  to  the  individual  intake  component 
were  compared  with  households  that  responded  partially  or  fully  in  the  individual  intake 
component.  The  mean  nutritive  value  of  the  household  food  used  was  expressed  as  a  percentage  of 
the RDAs for food energy and 15 vitamins and minerals. Use of foods was measured as the mean
number  of  pounds  of  food  used  from  51  food  groups  and  subgroups.  Both  measures  were  adjusted 
for  the  number  of  meals  eaten  away  from  home.  Sex  and  age  composition  of  the  household  was 
considered  in  the  comparisons  with  the  RDAs. 

Based on the above, the HNIS conducted a MANCOVA which compared nutrient levels of
4,242 households classified according to level of participation. The MANCOVA did not show
statistically significant differences in nutrient intakes by level of participation in the survey. The
Expert Panel considered the analysis useful for comparing households that substantially completed
the household interview and thereby provided some information about similarities and differences
among respondents who participated to varying degrees in the survey. However, no information is
available about characteristics of households that did not complete the screening step, as this
MANCOVA could not address total nonresponse. As noted previously, other analyses
already suggested to the Expert Panel that differences might exist between responding and
nonresponding households.

IV. REVIEW OF SUPPLEMENTAL ANALYSES TO ASSESS
NONRESPONSE  IN  THE  1987-88  NFCS  REQUESTED 
BY  THE  EXPERT  PANEL 


After  considerable  discussion  of  the  information  provided  by  the  analyses  done  by  the  HNIS,  the 
Expert  Panel  requested  three  supplemental  analyses  to  assess  the  impact  of  nonresponse  on  the 
estimates  of  food  and  nutrient  intakes  from  the  1987-88  NFCS.  These  analyses  were  an  attempt  to 
determine  whether  dietary  variables  were  associated  with  the  variables  known  to  be  associated  with 
the  nonresponse. 


A.  ANOVAS 

Analyses  and  information  in  Appendices  D  and  G  established  that  a  number  of  sociodemographic 
variables  were  related  to  response  status.  This  is  of  concern  in  itself  and  of  even  more  concern  if  the 
variables  are  also  related  to  dietary  intake,  cost  of  the  diet,  and  other  uses  made  of  the  data. 
Therefore, the Expert Panel considered it important to look at a limited number of dependent
dietary intake variables using a simpler approach.

Three hundred sixty one-way analyses of variance were done with unweighted data for intakes of
total energy, fiber, poultry, fluid milk, and fruit by the thirteen sociodemographic variables controlled
for by the HNIS (see Appendix G) plus several additional variables. The additional variables
were: 1) weekday/weekend day, 2) month, 3) living alone (one adult and no children), 4) urbanization
(central city, suburban, nonmetropolitan), 5) race (white, black, other), and 6) ethnicity (Hispanic
or not). See Appendix L.


1.         Sociodemographic  variables 

Three  sociodemographic  variables  (race,  urbanization,  and  income  expressed  as  percentage  of 
poverty  level)  stood  out  consistently  as  having  an  effect  on  intake  of  food  energy  and  fiber.  These 
independent  variables  were  the  ones  that,  based  upon  examination  of  the  earlier  analyses,  the 
Expert  Panel  had  thought  might  be  important  determinants  of  response  status  in  the  survey.  This 
observation  indicates  that  the  variables  associated  with  nutrient  intake  are  also  associated  with 
nonresponse,  increasing  the  level  of  concern  about  the  possible  effects  of  nonresponse. 

In  addition,  for  subgroups  with  particularly  low  response  rates,  it  becomes  even  more  crucial  that 
the  respondents  in  these  groups  be  representative  of  the  entire  subgroup,  because  those  responses 
will  have  very  large  weights.  Without  knowing  the  representativeness  of  the  respondents  in  the 
subgroups,  it  is  not  possible  to  ascertain  whether  or  not  the  weighting  scheme  employed  has  dealt 
successfully  with  the  nonresponse  problem.  However,  if  the  responses  of  individuals  who  are 
weighted  heavily  are  unusual,  application  of  a  large  weight  will  exaggerate  the  differences  and 
compound  problems  in  interpretation  of  the  data. 

The  problem  of  considering  variance  versus  bias  in  this  situation  is  very  complicated.  If  there  were 
no  nonresponse,  concern  would  exist  about  large  weights  falling  on  individuals  who  were  in  some 
way  atypical.  This  would  be  a  variance  consideration.  Extreme  values  weighted  by  extreme  weights 
lead to very large variances but, if the weighting is done correctly, there is no bias involved in this
situation. However, when nonresponse is present and the weights are large to compensate for
missing data, it is necessary to rely very heavily on the assumption of missing-at-random for the
weighting to be correct and valid for cases with unusual responses. The Expert Panel has no basis
to assume that the nonrespondents are missing at random, nor is the Panel confident that the
weighting system employed adjusts completely to bring the sample back in line with the population.
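
The variance side of this concern can be illustrated with Kish's approximate design effect for unequal weights, n * sum(w^2) / (sum w)^2. This measure is not used in the report and is introduced here only as an illustration. The weights below are hypothetical, chosen to mimic a sample in which a few respondents carry very large weights, and the function name is illustrative. The measure speaks only to loss of precision and says nothing about bias when the missing-at-random assumption fails.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect from unequal weights: n * sum(w^2) / (sum w)^2."""
    n = len(weights)
    if n == 0:
        raise ValueError("weights must not be empty")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return n * sum(w * w for w in weights) / (total * total)


# Hypothetical weights: most respondents at 20, a few with very large weights.
weights = [20.0] * 95 + [130.0] * 5
deff = kish_design_effect(weights)

# Approximate number of equally weighted respondents giving the same precision.
effective_n = len(weights) / deff
print(round(deff, 2), round(effective_n, 1))
```

With equal weights the design effect is 1. In this made-up example, five heavily weighted respondents out of 100 push it to nearly 1.9, so the sample behaves, for variance purposes, roughly like one of about half its nominal size.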

2.         Intake  variables 

Food  consumption  also  varied  widely  by  month  of  the  year  and  day  of  the  week.  For  example,  fruit 
intake of men was greatest in November but the sample size for that month was the smallest
reported and, thus, will be weighted heavily. Similarly for all individuals, poultry consumption was
highest in August, the month with fewest respondents. The occurrence of unusual values in months
with very small sample sizes causes concern about nonresponse or noncoverage bias. The small
sample sizes for different groups for different months may also point to the presence of a structural
problem in the survey, that is, the interviewers did not or could not adhere closely to their schedules.

Examination  of  differences  in  food  consumption  by  day  of  week  indicated  that  the  highest  energy 
consumption  was  reported  on  Saturdays.  Responses  were  also  lowest  for  this  day  of  the  week. 
Again,  this  is  a  situation  in  which  the  data  will  be  weighted  heavily  and  there  is  great  concern  about 
the  representativeness  of  the  respondents'  data. 

Household size appeared to be an important variable for differences in food intake; however, for
persons  under  20  years  of  age,  one  variable  (female  head  under  41  years  of  age  and  no  children 
under  18  years  of  age  present)  should  be  taken  out  of  the  adjustment  for  weighting  because  of  the 
small  sample  sizes. 

The  results  of  the  ANOVAs  demonstrated  that  dietary  intake  is  associated  with  the  same  variables 
that  are  related  to  response  status.  This  increased  the  Expert  Panel's  concern  about  the  potential 
for  a  sizable  nonresponse  bias  in  the  dietary  information.  In  the  next  section,  the  results  of  employ- 
ing a  weighting  procedure  are  analyzed  to  determine  whether  the  weights  may  have  successfully 
removed  the  effects  of  the  nonresponse. 


B.        COMPARISONS  OVER  TIME  OF  MEAN  INTAKES  BY  FOUR  AGE/SEX  GROUPS 

As  shown  in  Appendix  M,  mean  intakes  of  five  dietary  components  (food  energy,  protein,  poultry, 
fluid  milk,  and  fruit)  were  compared  for  children  1  to  5  years  of  age,  women  19  to  50  years  of  age 
with  a  child  (or  children)  1  to  5  years  of  age,  women  19  to  50  years  of  age  without  a  child  1  to  5 
years  of  age,  and  men  19  to  50  years  of  age  who  participated  in  the  1977-78  NFCS,  the  1985  and 
1986 CSFII, and the 1987-88 NFCS. Protein was substituted for fiber in these analyses because
fiber  intakes  were  not  available  for  the  1977-78  NFCS.  These  analyses  were  conducted  with  one- 
day  weighted  data  and  the  standard  errors  were  calculated  taking  into  account  survey  designs  and 
weighting  factors. 

Results  from  these  analyses  were  similar  to  the  results  of  analyses  shown  in  Appendix  I.  Intakes  of 
poultry  and  fruit  were  variable  and  did  not  show  any  particular  trends  over  time.  The  intakes  of 
fruit  were  so  variable  and  so  unusually  distributed  (e.g.,  highest  intakes  in  November  for  men)  that 
the Expert Panel considered those data uninterpretable in regard to the purpose of their evaluation.
Mean  intakes  of  fluid  milk  tended  to  show  less  variation  over  time. 

For  intakes  of  food  energy,  the  tables  and  bar  charts  showed  more  similar  values  for  1987-88  and 
1977-78 data than for 1987-88, 1986, and 1985. More similar and greater respondent burdens for
the  1987-88  and  1977-78  surveys  than  for  the  other  surveys  may  have  been  partially  responsible 
for  this  finding. 

Overall,  the  intakes  of  food  energy  were  consistently  lower  in  the  1987-88  NFCS  than  in  the  1985 
and 1986 CSFII. Energy intake was lower by about 125 kcal for children 1 to 5 years of age; by about
100 and 120 kcal, respectively, for women 19 to 50 years of age with and without children 1 to 5 years
of age; and by about 300 kcal for men 19 to 50 years of age. The Expert Panel was concerned about a
drop  of  this  magnitude  in  food  energy  (which  should  reflect  total  food  intake)  over  a  short  time 
period. 

Between  1986  and  1987,  the  food  composition  database  was  updated  to  reflect  actual  changes  in  fat 
content  of  beef  resulting  from  changes  in  beef  trimming  practices.  An  analysis  of  product  changes 
between  1977-78  and  1985-86  and  database  changes  over  the  same  time  period  with  respect  to  fat 
content  of  foods  suggested  that  product  changes  made  a  greater  contribution  to  a  decrease  in  intake 
of  total  fat  observed  in  women  between  1977-78  and  1985-86  (Perloff,  1988).  Changes  in  coding 
and  probing  procedures  between  the  surveys  appeared  to  have  little  effect  on  estimated  intakes  of 
total  fat  (Guenther  and  Perloff,  1990).  Mean  intakes  of  total  fat  of  men  and  children  in  1985-86 
were  similar  to  the  1977-78  means.  Although  a  decrease  in  total  fat  intake  could  contribute  to  a 
reduction  in  total  energy  intake,  the  reason(s)  for  the  decrease  in  total  energy  intake  for  men, 
women, and children 1 to 5 years of age participating in the 1987-88 NFCS remain to be
determined. 

Intakes  of  other  nutrients  might  also  be  expected  to  decrease  with  lower  food  energy  intake  and  this 
did  appear  to  be  the  case  for  protein  intakes  of  these  groups.  However,  evidence  for  this  was  not 
apparent from the analyses of intakes of other nutrients available at the meeting of the Expert
Panel  (see  Appendix  D). 

The Expert Panel considered it unlikely that the apparent decrease in food energy intakes actually
occurred over a short time period. The finding of lower food energy intakes may be interpreted in
several ways. As noted previously, changes in the food composition database could affect estimates
of  total  energy  intake.  The  respondent  burden  in  1987-88  NFCS  was  also  greater  than  that  in 
recent CSFII surveys, but was similar to that of the 1977-78 NFCS. Alternatively, the weighting
system may not be working, or it may be working and the differences are real.
The Expert Panel suggested that the differences may reflect problems of noncoverage (day of
the  week  and  month  of  the  year)  and  nonresponse  that  the  weighting  procedure  has  been  unable  to 
correct. 


C.        UNIVARIATE  ANALYSIS  OF  COVARIANCE  FOR  FOOD  ENERGY  INTAKE 

A multivariate analysis of covariance provided as part of the initial evaluation of the 1987-88 NFCS
data showed no statistically significant overall differences for intakes of 16 nutrients, when
considered jointly, with respect to level of participation in the survey. The Expert Panel considered
that  the  multivariate  approach  was  not  as  powerful  as  the  univariate  approach  since  the  effects  of 
all  16  variables  may  have  cancelled  each  other  out  even  though  group  sizes  were  large.  Because  the 
Expert  Panel  was  particularly  interested  in  intake  of  food  energy,  a  univariate  analysis  of  covariance 
that  was  focused  on  energy  intake  provided  an  alternative  and  more  powerful  means  of  evaluating 
whether  there  were  differences  in  food  intake  related  to  level  of  participation. 

The  ANCOVA  showed  that  differences  may  exist  by  degree  of  response  and  it  is  likely  that  differ- 
ences also  exist  between  respondents  and  nonrespondents.  For  example,  in  the  ANCOVA  run  for 
the  Expert  Panel,  Guenther  (1991)  reported  a  p  value  of  0.016.  The  Expert  Panel  concluded  that 
this  p  value  for  the  analysis  suggested  that  there  were  differences  in  food  energy  intake  with  respect 
to  level  of  participation.  If  differences  are  seen  among  persons  who  participated  to  different  degrees 
in  the  survey,  it  is  likely  that  there  are  differences  between  respondents  and  nonrespondents  in  the 
survey.  The  Expert  Panel  did  not  deem  it  appropriate  to  attempt  to  predict  any  trends  about  ways 
in  which  nonrespondents  might  differ  from  respondents  from  these  data  and  analyses. 




V.  CONCLUSIONS 


A.        INFLUENCE  OF  NONRESPONSE  ON  THE  1987-88  NFCS  DATA 

Based  on  the  information  available,  the  Expert  Panel  concluded  that  it  is  not  possible,  with  absolute 
certainty,  to  demonstrate  either  the  presence  or  absence  of  nonresponse  bias  in  the  1987-88  NFCS 
data.  However,  the  possibility  of  nonresponse  bias  is  suggested  by  the  analyses  of  data  discussed  in 
this  report.  It  is  not  possible  to  determine  the  extent  to  which  nonresponse  bias  might  influence 
interpretation  of  analyses  using  these  data. 

The  Expert  Panel  does  not  recommend  use  of  the  data  from  the  1987-88  NFCS.  However,  if  the 
HNIS  chooses  to  publish  estimates  of  mean  consumption  of  foods,  food  groups,  or  nutrients,  the 
greatest  caution  must  be  employed.  The  HNIS  should  include  a  strongly  worded  cautionary  state- 
ment concerning  the  potential  for  nonresponse  bias  in  all  publications  of  the  1987-88  NFCS  data. 
Similarly,  the  HNIS  should  provide  the  same  information  with  all  public  releases  of  information  and 
data. 

It  is  certainly  questionable  whether  or  not  the  weighted  data  provide  unbiased  estimates  of  the 
nation's dietary intake. Between-group comparisons are still possible, but these must be made with
the  recognition  that  the  respondents  may  not  be  completely  representative  of  the  subgroups.  Such 
estimates  cannot  be  aggregated  to  the  national  level. 

If  there  is  a  need  to  utilize  the  data  for  estimation  of  nutrient  intakes  or  cost  of  meals  or  of  food 
purchased,  further  examination  of  effects  of  combinations  of  sociodemographic  factors  should  be 
completed.  Similarly,  use  of  modified  or  alternative  weighting  schemes  might  be  explored. 

The  use  of  the  data  for  estimates  of  specific  foods  or  food  groups,  estimates  of  upper  percentiles  of 
intake,  or  estimates  of  intakes  of  subgroups  for  which  the  cell  size  is  small  is  particularly  question- 
able. Use  of  these  data  in  time  trend  analyses  in  the  future  will  always  provide  a  weak  point. 

If  the  1987-88  NFCS  data  are  used  for  estimation  of  nutrient  intakes,  sensitivity  analyses  should  be 
done  using  the  1977-78  NFCS  data.  If  the  results  are  meaningfully  different,  then  the  1985  and 
1986 CSFII data should be used to see if there is a trend that would support the difference. If there
is  not,  the  1987-88  NFCS  data  should  not  be  used. 
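
The sequence of checks recommended above can be written as a small decision procedure. The sketch below is illustrative only: the function name, the tolerance used to decide that two estimates are "meaningfully different," and the rule used to decide that the CSFII values "support" the difference (a monotonic movement from the 1977-78 value toward the 1987-88 value) are assumptions, not criteria set by the Expert Panel.

```python
def use_nfcs_1987_88(est_1977_78, est_1985, est_1986, est_1987_88, tolerance):
    """Return True if the 1987-88 estimate passes the sensitivity check.

    tolerance is a hypothetical threshold for a 'meaningful' difference,
    expressed in the same units as the estimates.
    """
    change = est_1987_88 - est_1977_78
    # Step 1: if 1987-88 is close to 1977-78, no further check is needed.
    if abs(change) <= tolerance:
        return True
    # Step 2 (assumed definition of a supporting trend): the 1985 and 1986
    # CSFII estimates move monotonically from the 1977-78 value toward
    # the 1987-88 value.
    if change > 0:
        return est_1977_78 <= est_1985 <= est_1986 <= est_1987_88
    return est_1977_78 >= est_1985 >= est_1986 >= est_1987_88
```

In practice the comparison would also need to account for the standard errors of each estimate; a fixed tolerance is used here only to keep the logic visible.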


B.        WEIGHTING  SCHEMES 

The  Expert  Panel  concluded  that  it  is  questionable  whether  any  adjustment  system  can  rectify  the 
nonresponse  and  possible  noncoverage  (day  of  the  week  and  month  of  the  year)  of  the  survey.  The 
results of the ANOVAs showing differences in food intakes related to a number of sociodemographic
variables magnify the importance of assumptions concerning representativeness of the respondents.
Hence, because of the high level of nonresponse, the Expert Panel is of the opinion that no
weighting  procedure  could  give  one  confidence  that  it  had  dealt  successfully  with  the  low  response 
rate. 

The  Expert  Panel's  concerns  about  the  effects  of  nonresponse  (and  noncoverage)  were  exacerbated 
by  the  lack  of  data  on  nonresponse.  Without  a  nonresponse  study  that  includes  some  information  on 
food consumption, there is no way to know whether weighting schemes or any other type of adjustment
can account for the problem of differences between respondents and nonrespondents.


VI.  RECOMMENDATIONS  FOR  FURTHER  CONSIDERATION


A.        ADDITIONAL  ANALYSES

The  Expert  Panel  determined  that  other  analyses  of  the  available  data  would  not  provide  much 
assistance  in  reaching  conclusions  about  the  effects  of  nonresponse  on  the  dietary  data  collected  in 
the 1987-88 NFCS; however, two sets of analyses might be of interest to the HNIS.

•  The  food  intakes  of  persons  with  combinations  of  sociodemographic,  day,  and  month  charac- 
teristics that  result  in  very  high  weights  should  be  examined  to  see  if  they  are  unusual. 

•  Results  from  the  ANOVAs  and  ANCOVA  suggest  that  simpler  analyses  may  provide  infor- 
mation about  factors  that  influenced  food  consumption  of  participants  in  the  1987-88  NFCS. 
For  example,  it  may  be  useful  to  the  HNIS  to  examine  intakes  of  food  components  by 
combinations  of  limited  numbers  of  sociodemographic  variables.  These  types  of  analyses 
should  be  done  to  investigate  effects  of  combinations  of  independent  variables  on  food 
consumption  and  might  suggest  particular  variables  to  be  used  in  an  adjustment  cell  process. 
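
The first of these analyses could be approached along the following lines. This is a minimal sketch, assuming each respondent record carries a final weight and a one-day energy intake; the field names `weight` and `energy_kcal`, the cutoff of five times the median weight, and the sample records are all made up for illustration and do not come from the NFCS files.

```python
from statistics import mean, median

def flag_high_weights(records, multiple=5.0):
    """Split records into (high, rest) by weight relative to the median weight.

    records: list of dicts with hypothetical keys 'weight' and 'energy_kcal'.
    multiple: assumed cutoff; a weight above multiple * median counts as high.
    """
    cutoff = multiple * median(r["weight"] for r in records)
    high = [r for r in records if r["weight"] > cutoff]
    rest = [r for r in records if r["weight"] <= cutoff]
    return high, rest

def compare_intakes(high, rest, key="energy_kcal"):
    """Return the unweighted mean intake of each group (None if a group is empty)."""
    return (mean(r[key] for r in high) if high else None,
            mean(r[key] for r in rest) if rest else None)

# Made-up records for illustration only.
sample = [
    {"weight": 1.0, "energy_kcal": 1800},
    {"weight": 1.2, "energy_kcal": 2100},
    {"weight": 0.9, "energy_kcal": 1950},
    {"weight": 1.1, "energy_kcal": 2000},
    {"weight": 9.0, "energy_kcal": 3400},
]
high, rest = flag_high_weights(sample)
high_mean, rest_mean = compare_intakes(high, rest)
```

A large gap between the two means would indicate that a few heavily weighted respondents with unusual intakes are driving the weighted estimates.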


B.        WEIGHTING  SCHEMES 

The  Expert  Panel  suggested  that  alternate  approaches  for  weighting  be  considered.  Since  the  day 
and month variables played a major role in the creation of the large weights in the present weighting
scheme (Loughin and Fuller, 1990), a two-step approach for weighting that allows for separation
of survey effects from nonresponse effects may be useful. Alternatively, a weighting approach
utilizing  adjustment  cells  would  generate  weights  adjusted  for  a  smaller  number  of  variables.  These 
approaches  are  not  inherently  better  than  the  approach  already  taken,  but  the  weights  generated 
may have a narrower range and there is less risk of extremely high weights being applied to unusual
intake  values. 
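
The adjustment cell idea can be illustrated with a short sketch. It assumes the simplest form of the technique, in which every respondent in a cell receives the weight (population count in the cell) / (number of respondents in the cell); this is not the regression weighting actually used for the 1987-88 NFCS, and the function names, cell labels, and counts are hypothetical.

```python
from collections import Counter

def adjustment_cell_weights(sample_cells, population_counts):
    """Weight for each cell = population count / number of respondents.

    sample_cells: list with one cell label per respondent.
    population_counts: dict mapping cell label -> population total
                       (for example, from Census estimates).
    """
    respondents = Counter(sample_cells)
    missing = [c for c in population_counts if respondents[c] == 0]
    if missing:
        # An empty cell cannot be weighted; such cells would need to be collapsed.
        raise ValueError(f"no respondents in cells: {missing}")
    return {c: population_counts[c] / respondents[c] for c in population_counts}

def weight_range_ratio(weights):
    """Ratio of the largest to the smallest weight."""
    values = list(weights.values())
    return max(values) / min(values)

# Made-up example: two equally sized population cells, unequal response.
cells = ["A"] * 40 + ["B"] * 10
weights = adjustment_cell_weights(cells, {"A": 4000, "B": 4000})
ratio = weight_range_ratio(weights)
```

Defining cells on fewer variables keeps the respondent count per cell larger, which is what narrows the range of the weights.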


C.        IMPROVEMENT  OF  RESPONSE  RATE 

1.         Monetary  incentives  to  improve  response  rate

The  offer  of  remuneration  has  been  shown  to  improve  response  rate  in  other  surveys  (Berry  and 
Kanouse,  1987;  Cook  et  al.,  1985;  Godwin,  1979;  Hubbard  and  Little,  1988;  James  and  Bolstein, 
1990;  Mizes  et  al.,  1984).  To  reduce  the  percentage  of  households  that  refused  the  screener  (14%) 
and  the  percentage  of  screened  households  that  refused  the  interview  (45%),  the  Expert  Panel 
suggested  that  a  monetary  incentive  be  offered  at  the  time  of  the  initial  contact.  Results  of  Berry 
and  Kanouse  (1987)  suggest  a  $20.00  check  made  payable  to  the  person  in  the  household  responsible 
for meal planning/purchasing can be effective. This is a small fraction of the approximately $1,000 cost per
household of surveys such as the NFCS [cost of the 1987-88 NFCS ($7,529,123)/7,285 participating
households].  Persons  should  be  told  that  they  will  receive  their  check  in  the  mail  with  a  letter 
confirming  their  appointed  interview.  They  should  not  be  promised  the  check  upon  completing  the 
interview because promises have been shown not to work. The goal of the incentive is to establish a
bond  of  trust  and  goodwill  with  the  respondent  (Berry  and  Kanouse,  1987).  Additionally,  to  increase 
individual  intake  response  rates,  consideration  should  be  given  to  providing  incentives  to  each 
individual family member (suggested amount $5.00) at the time of the first household visit. Checks
(no  cash)  should  be  made  out  to  each  household  member  personally.  Because  there  is  no  literature 
on  giving  incentives  during  household  visits,  the  latter  incentive  might  be  given  to  a  randomly 
selected  test  group  in  the  first  quarter  to  see  whether  or  not  it  significantly  improves  response  rate 
compared to the control group before being instituted in subsequent quarters.
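
The cost comparison and the suggested first-quarter test can be sketched as follows. The cost figures are those given above; the two-proportion z-test is one simple way to compare response rates between a test group and a control group and is an assumption, not a method specified by the Expert Panel. It ignores the clustered survey design, and the group sizes and response counts in the example are made up.

```python
from math import sqrt, erfc

# Cost figures quoted in the text above.
total_cost = 7_529_123           # cost of the 1987-88 NFCS, dollars
households = 7_285               # participating households
cost_per_household = total_cost / households
incentive_share = 20.00 / cost_per_household   # $20 check as a share of cost

def two_proportion_z(responded_test, n_test, responded_control, n_control):
    """Return (z, two-sided p value) for a difference in response rates."""
    p_test = responded_test / n_test
    p_control = responded_control / n_control
    pooled = (responded_test + responded_control) / (n_test + n_control)
    se = sqrt(pooled * (1 - pooled) * (1 / n_test + 1 / n_control))
    z = (p_test - p_control) / se
    p_value = erfc(abs(z) / sqrt(2))   # two-sided normal tail probability
    return z, p_value

# Hypothetical first-quarter result: 60 of 100 respond with the incentive,
# 45 of 100 respond without it.
z, p_value = two_proportion_z(60, 100, 45, 100)
```

With the quoted figures the cost works out to roughly $1,034 per participating household, so a $20 check is about 2 percent of that cost.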

2.         Procedural  aspects


It  appeared  that  insufficient  training  and  monitoring  of  interviewers,  possible  high  rates  of  turnover 
of  interviewers,  and/or  interviewers'  failure  to  follow  prescribed  schedules  may  have  contributed  to 
the  low  response  occurring  in  the  survey.  The  need  for  improved  management  of  personnel  is  an 
important consideration for improvement of response rate.


3.         Respondent  burden 

The  Expert  Panel  noted  that  the  respondent  burden  for  participants  in  the  NFCS  was  very  great 
and  that  it  may  have  influenced  the  decision  to  participate  at  both  the  household  and  individual 
levels.  Modification  of  the  survey  design  and  instruments  to  lighten  respondent  burden  may  also 
improve  response  in  future  surveys. 


D.        NONRESPONSE  STUDIES 

Studies  of  nonresponse  should,  if  possible,  include  collection  of  information  on  the  major  data 
elements  of  the  survey.  To  date,  nonresponse  studies  for  national  food  consumption  surveys  have 
included  only  data  on  sociodemographic  characteristics  of  nonrespondents.  For  example,  the 
nonresponse  study  for  the  1986  CSFII  compared  sociodemographic  characteristics  of 
nonrespondents  with  respondents.  This  procedure  did  provide  some  basis  for  comparison  and 
examination  of  nonresponse. 

It  is  recognized  that  collection  of  data  on  food  consumption  from  nonrespondents  may  be 
problematic; however, in the future, nonresponse studies should, if possible, include questions on
both  sociodemographic  factors  and  food  consumption  to  provide  a  more  appropriate  basis  for 
comparison  of  nonrespondents  with  respondents. 